1 Introduction



Many economic and financial applications involve time series data with autocorrelation and
heteroskedasticity properties. Often the unknown dependence structure is not the chief object
of interest, but inference on the parameter of interest involves the estimation of the unknown
dependence. In stationary time series models estimated by generalized method of moments
(GMM), robust inference is typically accomplished by consistently estimating the asymptotic
covariance matrix, which is proportional to the long run variance (LRV) matrix of the estimating
equations or moment conditions defining the estimator, using a kernel smoothing method.
In the econometric and statistical literature, the bandwidth parameter/truncation lag involved
in the kernel smoothing method is assumed to grow slowly with sample size in order to achieve
consistency. The inference is conducted by plugging in a covariance matrix estimator that is
consistent under heteroskedasticity and autocorrelation. This approach dates back to Newey
and West (1987) and Andrews (1991). Recently, Kiefer and Vogelsang (2005) (KV, hereafter)
developed an alternative first order asymptotic theory for the HAC (heteroskedasticity and
autocorrelation consistent) based robust inference, where the proportion of the bandwidth involved
in the HAC estimator to the sample size T, denoted as b, is held fixed in the asymptotics.
Under the fixed-b asymptotics, the HAC estimator converges to a nondegenerate yet nonstandard
limiting distribution. The tests based on the fixed-b asymptotic approximation were shown to
enjoy better finite sample properties than the tests based on the small-b asymptotic theory under
which the HAC estimator is consistent and the limiting distribution of the studentized statistic
admits a standard form, such as the standard normal or $\chi^2$ distribution. Using higher-order
Edgeworth expansions, Jansson (2004), Sun et al. (2008) and Sun (2010) rigorously proved that
the fixed-b asymptotics provides a high order refinement over the traditional small-b asymptotics
in the Gaussian location model. Sun et al. (2008) also provided an interesting decision theoretical
justification for the use of fixed-b rules in econometric testing. For non-Gaussian linear
processes, Gonçalves and Vogelsang (2011) obtained an upper bound on the convergence rate
of the error in the fixed-b approximation and showed that it can be smaller than the error of
the normal approximation under suitable assumptions.

Since the seminal contribution by KV (2005), there has been a growing body of work in
econometrics and statistics to extend and expand the fixed-b idea in the inference for time
series data. For example, Sun (2011a) developed a procedure for hypothesis testing in time
series models by using the nonparametric series method. The basic idea is to project the time
series onto a space spanned by a set of Fourier basis functions [see Phillips (2005) for an early
development] and construct the covariance matrix estimator based on the projection vectors
with the number of basis functions held fixed. Also see Sun (2011b) for the use of a similar
idea in the inference of the trend regression models. Ibragimov and Müller (2010) proposed a
subsampling based t-statistic for robust inference where the unknown dependence structure can
be in the temporal, spatial or other forms. In their paper, the number of non-overlapping blocks
is held fixed. The t-statistic based approach was extended by Bester et al. (2009) to the inference
of spatial and panel data with group structure. In the context of misspecification testing, Chen
and Qu (2010) proposed a modified M test of Kuan and Lee (2006) which involves dividing the
full sample into several recursive subsamples and constructing a normalization matrix based
on them. In the statistical literature, Shao (2010) developed the self-normalized approach to
inference for time series data that uses an inconsistent long run variance estimator based on
recursive subsample estimates. The self-normalized method is an extension of Lobato (2001)
from the sample autocovariances to more general approximately linear statistics and it coincides
with KV's fixed-b approach in the inference of the mean of a stationary time series by using
the Bartlett kernel and letting b = 1. Although the above inference procedures are proposed in
different settings and for different problems and data structures, they share a common feature
in the sense that the underlying smoothing parameters in the asymptotic covariance matrix
estimator, such as the number of basis functions, the number of cluster groups and the number of
recursive subsamples, play a similar role as the bandwidth in the HAC estimator. Throughout
the paper, we shall call these asymptotics, where the smoothing parameter (or function of
smoothing parameter) is held fixed, the fixed-smoothing asymptotics. In contrast, when the
smoothing parameter grows with respect to sample size, we use the term increasing-domain
asymptotics. At some places the terms fixed-K (or fixed-b) and increasing-K (or small-b)
asymptotics are used to follow the convention in the literature.

In this article, we make several methodological and theoretical contributions to the fixed- 
smoothing literature. First, we propose a general class of estimators for estimating the LRV ma- 
trix in the inference of stationary time series models estimated by GMM. Our proposal includes 
the traditional lag window type (or HAC) covariance estimator, the projection-based covariance 
estimator, the cluster-based covariance estimator and the blockwise recursive subsampling-based 
covariance estimator as special cases. The general covariance estimator considered here involves 
projecting the original data onto a space spanned by a sequence of basis functions (not nec- 
essarily orthogonal), where the number of basis functions K plays a key role in determining 
asymptotic properties of the estimator. Under the fixed-K asymptotics, we show that the Wald
statistic based on the general LRV estimator converges to an (approximate) F distribution with 
a scale constant depending only on K and the number of restrictions being tested. Thus our re- 
sult provides a unification of the various recently proposed fixed-smoothing inference procedures 
in the first order sense. 

Second, we derive a higher order expansion of the distribution of subsampling t-statistic 
when the underlying smoothing parameter K is held fixed, under the framework of the Gaussian 
location model. Specifically, we show that the error in the rejection probability (ERP, hereafter) 
is of order O(1/T) under the fixed-K asymptotics. These results are similar to those obtained
under the fixed-b asymptotics [see Sun et al. (2008)], but are stronger in the sense that we are
able to derive the exact form of the leading error term with order O(1/T). Building on the
new technical arguments used in proving expansion results for the subsampling t-statistic, we 
further study the expansion of the distribution of the Wald statistic with the HAC covariance 
estimator. Under the assumption that the eigenfunctions of the kernel in the HAC estimator 
have zero mean and other mild assumptions, we derive the leading error term of order O(1/T)
under a fixed-b framework. The explicit form of the leading error term in the approximation
provides a clear theoretical explanation for the empirical findings in the literature regarding the 
direction and magnitude of size distortion for time series with various degrees of dependence. 
To the best of our knowledge, this is the first time that the leading error terms are made explicit 
through the higher-order Edgeworth expansion under the fixed-smoothing asymptotics. 

Third, we propose a novel bootstrap method for time series, the Gaussian dependent boot- 
strap, which is able to mimic the second order properties of the original time series and produce 
a Gaussian bootstrap sample. For the Gaussian location model, we show that the inference 
based on the Gaussian dependent bootstrap is more accurate than the first order approxima- 
tion under the fixed-smoothing asymptotics. This seems to be the first time a bootstrap method 
is shown to be second order correct under the fixed-smoothing asymptotics; see Gonçalves and
Vogelsang (2011) for a recent attempt for the moving block bootstrap in the non-Gaussian
setting. Fourth, we provide some simulation results that clearly demonstrate the effectiveness of
Gaussian dependent bootstrap and the relevance of our higher order theory. 

We now introduce some notation. For a vector $x = (x_1, x_2, \dots, x_{q_0}) \in \mathbb{R}^{q_0}$, we let
$\|x\| = (\sum_{i=1}^{q_0} x_i^2)^{1/2}$ be the Euclidean norm. For a matrix $A = (a_{ij})_{i,j=1}^{q_0} \in \mathbb{R}^{q_0 \times q_0}$, denote by
$\|A\|_2 = \sup_{\|x\|=1} \|Ax\|$ the spectral norm and $\|A\|_\infty = \max_{1 \le i,j \le q_0} |a_{ij}|$ the max norm. Denote by
$\lfloor a \rfloor$ the integer part of a real number $a$. Let $L^2[0,1]$ be the space of square integrable functions on
$[0,1]$. Denote by $D[0,1]$ the space of functions on $[0,1]$ which are right continuous and have left
limits, endowed with the Skorokhod topology [see Billingsley (1999)]. Denote by "$\Rightarrow$" weak
convergence in the $\mathbb{R}^{q_0}$-valued function space $D^{q_0}[0,1]$, where $q_0 \in \mathbb{N}$. Denote by "$\xrightarrow{d}$" and
"$\xrightarrow{p}$" convergence in distribution and convergence in probability, respectively. The notation
$N(\mu, \Sigma)$ is used to denote the multivariate normal distribution with mean $\mu$ and covariance $\Sigma$.
Let $\chi^2_k$ be a random variable following the $\chi^2$ distribution with $k$ degrees of freedom and $G_k$ be the
corresponding distribution function.

The layout of the paper is as follows. Section 2 describes the GMM framework and some 
high level assumptions. Section 3 presents a general class of estimators for estimating the 
asymptotic covariance matrix of the GMM estimator. We study the first order fixed-smoothing 
asymptotics in Section 4. Section 5 contains the higher order expansions of the finite sample 
distributions of the subsampling t-statistic and the Wald statistic with the HAC estimator. We
introduce the Gaussian dependent bootstrap and the results about its second order accuracy in 
Section 6. Section 7 concludes. Technical details are gathered in the appendix. 

2 Basic setup and assumptions 

In linear and nonlinear models with moment conditions, it is standard to employ GMM 
[Hansen (1982)] to estimate the model parameters. We follow the GMM setup as described 
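in KV (2005). Consider a $d \times 1$ vector of parameters $\theta \in \Theta \subset \mathbb{R}^d$ of interest, where $\Theta$ is the
parameter space. Denote by $\theta_0$ the true value of $\theta$, which is an interior point of $\Theta$. Let $y_t$
denote a vector of observed data and assume the moment conditions

$$E[f(y_t, \theta)] = 0, \quad t = 1, 2, \dots, T \qquad (1)$$

hold if and only if $\theta = \theta_0$, where $f(\cdot)$ is an $m \times 1$ vector of functions with $m \ge d$ and
$\mathrm{rank}(E[\partial f(y_t, \theta_0)/\partial\theta']) = d$. When $m > d$, the parameter $\theta$ is over-identified with the
degree of over-identification $v = m - d$. Define the partial sum $g_t(\theta) = T^{-1}\sum_{j=1}^{t} f(y_j, \theta)$. Then
the GMM estimator of $\theta_0$ is given by

$$\hat\theta_T = \arg\min_{\theta \in \Theta} g_T(\theta)' W_T g_T(\theta), \qquad (2)$$

where $W_T$ is an $m \times m$ semi-positive definite weighting matrix. Further define

$$G_t(\theta) = \frac{\partial g_t(\theta)}{\partial\theta'} = \frac{1}{T}\sum_{j=1}^{t} \frac{\partial f(y_j, \theta)}{\partial\theta'}.$$

Using the mean value theorem, we have $g_T(\hat\theta_T) = g_T(\theta_0) + G_T(\bar\theta_T)(\hat\theta_T - \theta_0)$, where $\bar\theta_T$ is a value
between $\theta_0$ and $\hat\theta_T$. Note that $G_T(\hat\theta_T)' W_T g_T(\hat\theta_T) = 0$ by the first order condition, which implies
that

$$G_T(\hat\theta_T)' W_T g_T(\theta_0) + G_T(\hat\theta_T)' W_T G_T(\bar\theta_T)(\hat\theta_T - \theta_0) = G_T(\hat\theta_T)' W_T g_T(\hat\theta_T) = 0.$$

Solving the above equation, we have

$$T^{1/2}(\hat\theta_T - \theta_0) = -\left(G_T(\hat\theta_T)' W_T G_T(\bar\theta_T)\right)^{-1} G_T(\hat\theta_T)' W_T \left(T^{1/2} g_T(\theta_0)\right).$$

To derive the asymptotic distribution of $\hat\theta_T$, we make the following high-level assumptions, as in
KV (2005) and Sun (2010).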

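Assumption 2.1. $\hat\theta_T \xrightarrow{p} \theta_0$.

Assumption 2.2. $T^{1/2} g_{\lfloor rT \rfloor}(\theta_0) \Rightarrow \Lambda W_m(r)$, where

$$\Lambda\Lambda' = \Omega = \sum_{j=-\infty}^{+\infty} E[f(y_t, \theta_0) f(y_{t-j}, \theta_0)']$$

and $W_m(r)$ is an $m$-dimensional vector of independent standard Brownian motions.

Assumption 2.3. $G_T(\bar\theta_T) \xrightarrow{p} G_0$ uniformly for all $\bar\theta_T$ between $\hat\theta_T$ and $\theta_0$, where
$G_0 = E[\partial f(y_t, \theta_0)/\partial\theta']$.

Assumption 2.4. The weighting matrix $W_T$ is symmetric and semi-positive definite such that
$W_T \xrightarrow{p} W_0$ and $G_0' W_0 G_0$ is positive definite.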

Under Assumptions 2.1-2.4, it is easy to see that

$$T^{1/2}(\hat\theta_T - \theta_0) \xrightarrow{d} -(G_0' W_0 G_0)^{-1} G_0' W_0 \Lambda W_m(1) \overset{d}{=} N(0, V_0),$$

where "$\overset{d}{=}$" denotes "equal in distribution" and the asymptotic covariance matrix is
$V_0 := (G_0' W_0 G_0)^{-1} G_0' W_0 \Omega W_0 G_0 (G_0' W_0 G_0)^{-1}$. To make inference on $\theta_0$, we have to estimate $G_0$, $W_0$
and the LRV matrix $\Omega$. Under the above assumptions, $G_0$ and $W_0$ can be consistently estimated
by their sample counterparts $G_T(\hat\theta_T)$ and $W_T$ respectively. It remains to estimate the LRV
matrix $\Omega$. In the next section, we introduce a general class of estimators for $\Omega$ and $V_0$.

3 LRV estimators 

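To present the idea, we focus on the hypothesis testing problem $H_{1,0}: r(\theta_0) = 0$ versus
the alternative $H_{1,a}: r(\theta_0) \ne 0$, where $r(\theta)$ is a $p \times 1$ continuously differentiable function
with first order derivative matrix $R(\theta) = \partial r(\theta)/\partial\theta'$ and $p \le d$. Let

$$\hat V_T = \left(G_T(\hat\theta_T)' W_T G_T(\hat\theta_T)\right)^{-1} \left(G_T(\hat\theta_T)' W_T \hat\Omega_T W_T G_T(\hat\theta_T)\right) \left(G_T(\hat\theta_T)' W_T G_T(\hat\theta_T)\right)^{-1}$$

be an estimator of $V_0$, where $\hat\Omega_T$ is the LRV estimate of $\Omega$. The Wald statistic for testing $H_{1,0}$
against $H_{1,a}$ is defined as

$$F_T = T\, r(\hat\theta_T)' \hat D_T^{-1} r(\hat\theta_T)/p, \qquad (3)$$

where $\hat D_T = R(\hat\theta_T) \hat V_T R(\hat\theta_T)'$. The widely used lag window type LRV estimator is given by

$$\hat\Omega_T = \frac{1}{T}\sum_{i=1}^{T}\sum_{j=1}^{T} \mathcal{K}\left(\frac{i-j}{bT}\right) f(y_i, \hat\theta_T) f(y_j, \hat\theta_T)', \qquad (4)$$

where $\mathcal{K}(\cdot)$ is a kernel function and $b$ is the proportion of the truncation lag to the sample size.
By setting

$$u_i = R(\hat\theta_T)\left(G_T(\hat\theta_T)' W_T G_T(\hat\theta_T)\right)^{-1} G_T(\hat\theta_T)' W_T f(y_i, \hat\theta_T),$$

we have

$$\hat D_T = \frac{1}{T}\sum_{i=1}^{T}\sum_{j=1}^{T} \mathcal{K}\left(\frac{i-j}{bT}\right) u_i u_j'.$$

When $\mathcal{K}(\cdot)$ is semi-positive definite¹, by Mercer's theorem, we have the spectral decomposition

$$\mathcal{K}(r - t) = \sum_{j=1}^{+\infty} \lambda_j \phi_j(r)\phi_j(t), \quad 0 \le r, t \le 1/b, \qquad (5)$$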



¹A bivariate function $g(r, s): \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is called semi-positive definite if for any positive integer $n$,
we have $\sum_{i,j=1}^{n} c_i c_j g(a_i, a_j) \ge 0$ for all $(a_1, a_2, \dots, a_n)$ and $(c_1, c_2, \dots, c_n)$ in $\mathbb{R}^n$. Here we assume that
$\mathcal{K}(r - s) = g(r, s)$ is semi-positive definite.
where $\{\lambda_j\}$ and $\{\phi_j\}$ are the eigenvalues and orthonormal eigenfunctions corresponding to the
kernel function respectively. We thus have the representation

$$\hat D_T = \sum_{j=1}^{K} \lambda_j \left\{\frac{1}{\sqrt{T}}\sum_{i=1}^{T} \phi_j\left(\frac{i}{bT}\right) u_i\right\}\left\{\frac{1}{\sqrt{T}}\sum_{i=1}^{T} \phi_j\left(\frac{i}{bT}\right) u_i\right\}'$$

with $K = +\infty$. In the traditional asymptotics, $b$ goes to zero as $T$ increases, which is referred to as
the small-b asymptotics. When $b \in (0, 1]$ is held fixed, it corresponds to the fixed-b asymptotics
in KV (2005). As pointed out in some recent studies [see e.g., Bester et al. (2009); Sun (2011a;
2011b); Chen and Qu (2010)], $K$ can also be held as a fixed positive integer, which can lead
to a more accurate first order approximation. In light of these recent findings, we introduce
a general class of estimators to estimate the LRV matrix. With a slight abuse of notation,
we let $\{\phi_s(t)\}_{s=1}^{K}$ be a sequence of linearly independent functions² in $L^2[0, 1/b]$ and $\{\lambda_j\}$ be a
sequence of nonnegative weights such that $\sum_{j=1}^{K} \lambda_j = 1$. Note that the $\lambda_j$'s in (5) are nonnegative
when we consider semi-positive definite kernels in (4). Further let $V_s = T^{-1/2}\sum_{i=1}^{T} \phi_s(i/(bT)) u_i$
be the normalized inner product between $\{u_i\}_{i=1}^{T}$ and $\{\phi_s(i/(bT))\}_{i=1}^{T}$. Define $R = (R_{ij})_{i,j=1}^{K}$
with $R_{ij} = \int_0^1 \tilde\phi_i(t/b)\tilde\phi_j(t/b)dt$, where $\tilde\phi_s(t/b) = \phi_s(t/b) - \int_0^1 \phi_s(t/b)dt$, and $L = (L_{ij})_{i,j=1}^{K}$ an
upper triangular matrix based on the Cholesky decomposition of $R^{-1}$, i.e., $L'L = R^{-1}$. Define
$V = (V_1', V_2', \dots, V_K')'$ and

$$V^* = (V_1^{*\prime}, V_2^{*\prime}, \dots, V_K^{*\prime})' = (L \otimes I_p)V,$$

where $V_i^* = \sum_{j=1}^{K} L_{ij} V_j$, $1 \le i \le K$. Then the general LRV estimator is given by

$$\hat D_T = \sum_{s=1}^{K} \lambda_s V_s^* V_s^{*\prime} = \frac{1}{T}\sum_{i=1}^{T}\sum_{j=1}^{T}\left\{\sum_{s=1}^{K} \lambda_s \left[\sum_{m=1}^{K} L_{sm}\phi_m\left(\frac{i}{bT}\right)\right]\left[\sum_{l=1}^{K} L_{sl}\phi_l\left(\frac{j}{bT}\right)\right]\right\} u_i u_j' \qquad (6)$$

and the test statistic based on the general LRV estimator is defined as

$$F_T = [\sqrt{T} r(\hat\theta_T)]' \hat D_T^{-1} [\sqrt{T} r(\hat\theta_T)]/p. \qquad (7)$$

The matrix $R$ is introduced for orthogonalization so that the limiting distribution of the test
statistic $F_T$ does not depend on the basis functions. Note that the choice of $R$ is not unique
(see Example 3.3). In what follows, we shall show that the recently developed nonparametric
series covariance estimator [Sun (2011a; 2011b)], the recursive subsampling-based covariance
estimator [Chen and Qu (2010)] and the cluster covariance estimator (CCE) [Bester et al.
(2009)] are all special cases of the general LRV estimator. Throughout Examples 3.1-3.3, we
set $b = 1$ and $\lambda_j = 1/K$ for $j = 1, 2, \dots, K$.

Example 3.1. Let $\{\phi_s(t)\}_{s=1}^{K}$ be a sequence of orthonormal basis functions with $\int_0^1 \phi_s(t)dt = 0$.

²A set of elements $\{\psi_i\}_{i=1}^{K}$ in a real valued vector space is called linearly independent if and only if
$\sum_{i=1}^{K} a_i\psi_i = 0 \Rightarrow a_i = 0$ for $i = 1, 2, \dots, K$. Here $0$ denotes the null element in the vector space.






Then we have $R = I_{K \times K}$ and $\hat D_T = \frac{1}{K}\sum_{j=1}^{K} V_j V_j'$, where $V_j = T^{-1/2}\sum_{i=1}^{T} \phi_j(i/T) u_i$. When
$\phi_s(t) = \sqrt{2}\sin(2\pi s t)$ (or $\phi_s(t) = \sqrt{2}\cos(2\pi s t)$), $s = 1, 2, \dots, K$, it is straightforward to see that
the LRV estimator corresponds to the series estimator considered in Sun (2011a; 2011b). In
this case, the LRV estimator involves projecting the data onto a set of orthonormal basis functions and
using the sample variance of the projection vectors, namely $\hat D_T$.
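
To make the construction in Example 3.1 concrete, here is a minimal sketch for the scalar location model, assuming $u_i$ is the demeaned observation $X_i - \bar X$ (the sign is immaterial because $u_i$ enters quadratically) and using the sine basis with equal weights $1/K$. The function name `series_lrv` is illustrative and not taken from the paper.

```python
import math


def series_lrv(x, K):
    """Series LRV estimate (1/K) * sum_j V_j^2 for a scalar series x.

    Assumption for this sketch: location model, so u_i = x_i - mean(x),
    and phi_j(t) = sqrt(2) * sin(2 * pi * j * t) for j = 1, ..., K.
    """
    T = len(x)
    xbar = sum(x) / T
    u = [xi - xbar for xi in x]  # demeaned observations
    total = 0.0
    for j in range(1, K + 1):
        # V_j = T^{-1/2} * sum_i phi_j(i/T) * u_i, with time index i = 1, ..., T
        v_j = sum(
            math.sqrt(2.0) * math.sin(2.0 * math.pi * j * (i + 1) / T) * u[i]
            for i in range(T)
        ) / math.sqrt(T)
        total += v_j * v_j
    return total / K  # equal weights lambda_j = 1/K
```

With K held fixed the estimate remains random in the limit, which is the point of the fixed-smoothing approach: the test statistic is then compared with a scaled F distribution rather than a chi-square distribution.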

Example 3.2. For any fixed $K$ with $K < T$, we consider the basis functions $\phi_s(t) = I\{0 \le t \le
s/(K+1)\}$, $s = 1, 2, \dots, K$, where $I$ denotes the indicator function. Simple calculation gives us
$R_{ij} = \int_0^1 \tilde\phi_i(t)\tilde\phi_j(t)dt = \min(i,j)/(K+1) - ij/(K+1)^2$,³ and $\hat D_T = \frac{1}{K}\sum_{s=1}^{K} V_s^* V_s^{*\prime}$, where

$$V_s^* = L_{ss} V_s + L_{s,s+1} V_{s+1}, \qquad V_s = \frac{1}{\sqrt{T}}\sum_{i=1}^{\lfloor sT/(K+1) \rfloor} u_i,$$

with $s = 1, 2, \dots, K$ and $V_{K+1} = 0$. Therefore, the general LRV estimator reduces to the recursive
subsampling-based estimator in Chen and Qu (2010), where the idea is to divide the full sample
into $K + 1$ recursive subsamples and construct a normalization matrix based on the subsamples.
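
The matrix $R$ in Example 3.2 is the covariance of a Brownian bridge on the grid $s/(K+1)$, so its inverse is tridiagonal and the upper triangular factor $L$ with $L'L = R^{-1}$ has only two nonzero diagonals. The sketch below checks this numerically. The closed-form entries used in `upper_factor` were derived for this illustration (they are an assumption, not copied from the paper), and all function names are hypothetical.

```python
import math


def bridge_gram(K):
    """R_ij = min(i, j)/(K+1) - i*j/(K+1)^2 for i, j = 1, ..., K."""
    n = K + 1
    return [[min(i, j) / n - i * j / n ** 2 for j in range(1, K + 1)]
            for i in range(1, K + 1)]


def upper_factor(K):
    """Upper bidiagonal L with L'L = R^{-1} (entries derived for this sketch)."""
    n = K + 1
    L = [[0.0] * K for _ in range(K)]
    for s in range(1, K + 1):
        L[s - 1][s - 1] = math.sqrt(n * (s + 1) / s)
        if s < K:
            L[s - 1][s] = -math.sqrt(n * s / (s + 1))
    return L


def max_identity_error(K):
    """Largest absolute entry of (L'L) R - I; near zero if L'L = R^{-1}."""
    R = bridge_gram(K)
    L = upper_factor(K)
    M = [[sum(L[k][i] * L[k][j] for k in range(K)) for j in range(K)]
         for i in range(K)]
    P = [[sum(M[i][k] * R[k][j] for k in range(K)) for j in range(K)]
         for i in range(K)]
    return max(abs(P[i][j] - (1.0 if i == j else 0.0))
               for i in range(K) for j in range(K))
```

Because each row of $L$ has at most two nonzero entries, every $V_s^*$ combines only the two adjacent recursive partial sums $V_s$ and $V_{s+1}$.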

Example 3.3. Let $\{A_j\}_{j=1}^{K}$ be a partition of the unit interval $[0, 1]$ with $K > p$. Suppose each $A_j$
is a finite union of disjoint intervals in $[0, 1]$. Let $\phi_s(t) = I(t \in A_s)$, $s = 1, 2, \dots, K$. If we set
$R_{ij} = \int_0^1 \phi_i(t)\phi_j(t)dt$, then $L = \mathrm{diag}(1/\sqrt{|A_1|}, 1/\sqrt{|A_2|}, \dots, 1/\sqrt{|A_K|})$, where $|A|$ denotes the
Lebesgue measure of the set $A$. Further assume $|A_1| = |A_2| = \cdots = |A_K| = 1/K$, then we have

$$\hat D_T = \frac{1}{T}\sum_{i=1}^{T}\sum_{j=1}^{T}\sum_{s=1}^{K} I(i/T \in A_s) I(j/T \in A_s) u_i u_j' = \frac{1}{T}\sum_{i=1}^{T}\sum_{j=1}^{T} I(i, j \in \text{the same group}) u_i u_j',$$

where $i$ is in group $s$ if and only if $i/T \in A_s$, $s = 1, 2, \dots, K$. In this case, the general LRV
estimator is the same as the CCE considered in Bester et al. (2009), where the idea is to
utilize the group structure in the observations and construct a covariance estimator based on the
parameter estimates in each group. Using arguments similar to those in Sun (2010), we can show that

$$\frac{1}{\sqrt{T}}\sum_{i=1}^{\lfloor rT \rfloor} u_i \Rightarrow \Lambda B_p(r),$$

where $\Lambda$ is an invertible matrix such that

$$\Lambda\Lambda' = R(\theta_0)(G_0' W_0 G_0)^{-1} G_0' W_0 \Omega W_0 G_0 (G_0' W_0 G_0)^{-1} R(\theta_0)'$$

and $B_p(r)$ denotes a $p$-dimensional vector of independent Brownian bridges. It implies that

$$\frac{1}{\sqrt{T}}\sum_{i \in s\text{th group}} u_i \Rightarrow \Lambda\int_{A_s} dB_p(r) \overset{d}{=} \frac{1}{\sqrt{K}}\Lambda(Z_s - \bar Z),$$



³In this case, we have $L_{ii} = \sqrt{(K+1)(i+1)/i}$, $L_{i,i+1} = -\sqrt{(K+1)i/(i+1)}$ and $L_{ij} = 0$ otherwise.
and

$$\hat D_T \xrightarrow{d} \frac{1}{K}\Lambda\sum_{s=1}^{K}(Z_s - \bar Z)(Z_s - \bar Z)'\Lambda',$$

where $(Z_1', Z_2', \dots, Z_K')' \sim N(0, I_K \otimes I_p)$ and $\bar Z = \sum_{s=1}^{K} Z_s/K$. When $p = 1$, it is well known
that

$$\frac{\sqrt{K}\,\bar Z}{\sqrt{\sum_{s=1}^{K}(Z_s - \bar Z)^2/(K-1)}} \overset{d}{=} t_{K-1},$$

which implies $\sqrt{F_T} \xrightarrow{d} \sqrt{K/(K-1)}\,|t_{K-1}|$ under $H_{1,0}$. Note that $\sqrt{(K-1)F_T/K}$ coincides with the
subsampling-based t-statistic in Ibragimov and Müller (2010) when we consider a location model
and $r(\theta_0) = \theta_0 - \theta^*$ for a specific value $\theta^*$. When $p > 1$, we have $F_T \xrightarrow{d} \frac{K}{K-p}F_{p,K-p}$. It is
worth noting that the choice of $R = (R_{ij})$ with $R_{ij} = \int_0^1 \tilde\phi_i(t)\tilde\phi_j(t)dt$ is also valid. In this case,
the limiting distribution of $F_T$ would be a scaled F distribution with $p$ numerator and $K - p + 1$
denominator degrees of freedom [see Theorem 4.1].

Remark 3.1. For the subsampling-based inference, Assumption 2.2 can be relaxed
to assumptions which guarantee the finite dimensional convergence of the $K$ group sums
over $\mathcal{G}_1, \dots, \mathcal{G}_K$, each normalized by $|\mathcal{G}_i|^{-1/2}$. Here $\mathcal{G}_i$ is the index set for the $i$th group
and $|\cdot|$ denotes the cardinality. When heteroscedasticity is present across different groups,
the t-statistic tends to be conservative [see Ibragimov and Müller (2010)].



4 First order fixed-smoothing asymptotics 

In what follows, we consider the first order fixed-smoothing asymptotics of the test statistic
$F_T$ based on the general LRV estimator under the null hypothesis and local alternatives. To
emphasize the dependence on the smoothing parameter $K$, we shall use the notation $F_T(K)$
instead of $F_T$.

Theorem 4.1. Suppose $p \le K < \infty$ and $b \in (0, 1]$ are both fixed. Let $R = (R_{ij})_{i,j=1}^{K}$ with
$R_{ij} = \int_0^1 \tilde\phi_i(t/b)\tilde\phi_j(t/b)dt$ in the general LRV estimator. Further assume that $\phi_j(t)$ is continuously
differentiable almost everywhere for $j = 1, 2, \dots, K$. Under Assumptions 2.1-2.4 and $H_{1,0}$, we
have

$$F_T(K) \xrightarrow{d} Q_{p,K} := U_p' D_p^{-1} U_p/p, \qquad (8)$$

where $D_p = \sum_{j=1}^{K} \lambda_j\nu_j\nu_j'$, and $\{\nu_j\}_{j=1}^{K}$ and $U_p$ are independent and identically distributed (iid) as
$N(0, I_p)$. In particular, if $\lambda_j = 1/K$ for $j = 1, 2, \dots, K$, we get

$$F_T(K) \xrightarrow{d} \frac{K}{K-p+1}F_{p,K-p+1}. \qquad (9)$$

Remark 4.1. When the weights $\lambda_j$'s are not equal and $p = 1$, $D_p$ is a weighted sum of
independent $\chi^2_1$ random variables. The limiting null distribution $Q_{p,K}$ can be further approximated
by a scaled F distribution with the parameters chosen properly to match the first two moments
[see Sun (2010)]. Compared to Sun (2011a), we do not make the assumption that $\int_0^1 \phi_i(t)dt = 0$
and we allow the basis functions to be non-orthonormal (see Example 3.2). It is also worth
noting that the above results hold when $\phi_s(t) = I(t \in A_s)$ with $A_s$ being a finite union of
disjoint intervals in $[0, 1]$.

Theorem 4.2. Consider the local alternatives $r(\theta_0) = c/\sqrt{T}$ with $c \in \mathbb{R}^p$. Under
the same assumptions as in Theorem 4.1 with $\lambda_j = 1/K$, we have

$$F_T(K) \xrightarrow{d} \frac{K}{K-p+1}F_{p,\,K-p+1,\,c'(R(\theta_0)V_0R(\theta_0)')^{-1}c},$$

where $F_{a,b,\delta}$ denotes the noncentral F distribution with degrees of freedom $a$ and $b$, and noncentrality
parameter $\delta$.

The theorem shows that the test based on $F_T(K)$ has non-trivial power against local alternatives
of order $1/\sqrt{T}$ and it is seen to be consistent if $\|c\| \to +\infty$ as $T \to +\infty$. When $b$ is fixed
and $K$ satisfies $1/K + K/T \to 0$, we can show that the general LRV estimator is consistent
and the limiting distribution of $F_T(K)$ is $\chi^2_p/p$. Since the main focus of this article is on the
fixed-smoothing asymptotics (i.e., $K$ is fixed), we do not present the proof but would expect
the argument to be similar to Sun (2011a).
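
As an informal illustration of Theorem 4.1, the sketch below simulates the limit $Q_{p,K}$ for $p = 1$ with equal weights $\lambda_j = 1/K$, in which case (9) reduces to the $F_{1,K}$ distribution because $K/(K-p+1) = 1$. The function name, the replication count and the seed are assumptions chosen for the example.

```python
import random


def simulate_limit_quantile(K, level=0.95, reps=100_000, seed=0):
    """Monte Carlo quantile of U^2 / ((1/K) * sum_j nu_j^2) with iid N(0, 1) draws.

    This is Q_{1,K} from (8) with equal weights; it covers only the case p = 1.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        u = rng.gauss(0.0, 1.0)
        d = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(K)) / K
        draws.append(u * u / d)
    draws.sort()
    return draws[int(level * reps) - 1]
```

For small K the simulated 95% quantile is noticeably larger than the $\chi^2_1$ critical value of about 3.84, and it moves toward that value as K grows, which matches the consistency statement for $1/K + K/T \to 0$.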

5 Higher order expansions 

This paper is partially motivated by recent studies on the ERP for the Gaussian location 
model by Jansson (2004) and Sun et al. (2008), who showed that the ERP is of order O(1/T)
under the fixed-b asymptotics, which is smaller than the ERP under the small-b asymptotics.
A natural question is to what extent the ERP result can be extended to the above-mentioned 
methods in Section 1 under the fixed-smoothing asymptotics. Following Jansson (2004) and 
Sun et al. (2008), we focus on the inference of the mean of a univariate Gaussian stationary 
time series or equivalently a Gaussian location model. We expect that the higher order terms 
in the asymptotic expansion under the Gaussian assumption will also show up in the general 
expansion without the Gaussian assumption. 

5.1 Expansion for the finite sample distribution of subsampling- 
based t-statistic 

In this part, we investigate the Edgeworth expansion of the distribution of the subsampling-based
t-statistic [Ibragimov and Müller (2010)]. Here we treat the subsampling-based t-statistic
and other cases separately, because the t-statistic corresponds to a different choice of
orthogonalization matrix $R$ as explained in Example 3.3. Given the observations $X = (X_1, \dots, X_T)'$,
we divide the sample into $K$ approximately equal sized groups of consecutive observations. The
observation $X_i$ is in the $j$-th group if and only if $i \in \mathcal{G}_j = \{s \in \mathbb{Z} : (j-1)T/K < s \le jT/K\}$,
$j = 1, 2, \dots, K$. Assume that the time series $\{X_i\}$ is stationary and Gaussian with
mean $\mu$ and autocovariance function $\gamma_X(i - j) = E[(X_i - \mu)(X_j - \mu)]$. Define the sample mean
of the $k$-th group as

$$\hat\mu_k = \frac{1}{|\mathcal{G}_k|}\sum_{i \in \mathcal{G}_k} X_i, \quad k = 1, 2, \dots, K.$$

Let $\hat\mu = (\hat\mu_1, \hat\mu_2, \dots, \hat\mu_K)'$, $\bar\mu_n = \frac{1}{K}\sum_{i=1}^{K}\hat\mu_i$ and $S_n^2 = \frac{1}{K-1}\sum_{i=1}^{K}(\hat\mu_i - \bar\mu_n)^2$. Then the
subsampling-based t-statistic for testing the null hypothesis $H_{2,0}: \mu = \mu_0$ versus the
alternative $H_{2,a}: \mu \ne \mu_0$ is given by

$$T_K = \frac{\sqrt{K}(\bar\mu_n - \mu_0)}{S_n} = \frac{\sqrt{K}(\bar\mu_n - \mu_0)}{\sqrt{\frac{1}{K-1}\sum_{i=1}^{K}(\hat\mu_i - \bar\mu_n)^2}}. \qquad (10)$$

Our goal here is to develop an Edgeworth expansion of $P(|T_K| \le x)$ when $K$ is fixed and the sample
size $T \to \infty$. Denote by $t_k$ a random variable following the t distribution with $k$ degrees of freedom.
The following theorem gives the higher order expansion under the Gaussian assumption. 
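
The statistic in (10) is simple to compute. The following sketch assumes, for simplicity, that $K$ divides $T$ exactly so that all groups have equal size (the case covered by Theorem 5.1). The function name `subsampling_t` is illustrative.

```python
import math


def subsampling_t(x, K, mu0):
    """Subsampling-based t-statistic T_K from K consecutive equal-sized groups.

    Assumption for this sketch: K divides len(x), so |G_1| = ... = |G_K|.
    """
    T = len(x)
    if K < 2 or T % K != 0:
        raise ValueError("this sketch needs K >= 2 and K dividing T")
    m = T // K
    group_means = [sum(x[k * m:(k + 1) * m]) / m for k in range(K)]
    grand_mean = sum(group_means) / K
    s2 = sum((g - grand_mean) ** 2 for g in group_means) / (K - 1)
    return math.sqrt(K) * (grand_mean - mu0) / math.sqrt(s2)
```

Under the first order fixed-K approximation, $|T_K|$ is compared with quantiles of the t distribution with $K - 1$ degrees of freedom; the expansion in Theorem 5.1 quantifies the error of that comparison.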

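Theorem 5.1. Assume that $\{X_t\}$ is a stationary Gaussian time series with autocovariance
function $\{\gamma_X(h)\}$ satisfying $\sigma^2 := \sum_{h=-\infty}^{+\infty}\gamma_X(h) > 0$ and $\sum_{h=-\infty}^{+\infty} h^2|\gamma_X(h)| < \infty$. Further
suppose that $|\mathcal{G}_1| = |\mathcal{G}_2| = \cdots = |\mathcal{G}_K|$ and $K$ is fixed. Then under $H_{2,0}$, we have

$$\sup_{x \in [0,+\infty)} |P(|T_K| \le x) - \Psi_K(x)| = O(1/T^2), \qquad (11)$$

where $\Psi_K(x) = P(|t_{K-1}| \le x) - \frac{B}{\sigma^2 T}\mathcal{T}(x, K)$ with

$$\mathcal{T}(x, K) = -K^2 P(|t_{K-1}| \le x) + (K+1)E\left[\chi^2_{K-1}G_1\left(\frac{x^2\chi^2_{K-1}}{K-1}\right)\right] + E\left[G_3\left(\frac{x^2\chi^2_{K-1}}{K-1}\right)\right]$$

and $B = \sum_{h=-\infty}^{+\infty}|h|\gamma_X(h)$.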

From the above expression, we see that the leading error term is of order $O(1/T)$ and the
magnitude and direction of the error depend upon $B/\sigma^2$, which is related to the second order
properties of the time series, and $\mathcal{T}(x, K)$, which is independent of the dependence structure of
$\{X_t\}$ and can be approximated numerically for given $x$ and $K$. Figure 1 plots the approximated
values of $\mathcal{T}(t_{K-1}(1-\alpha), K)/K$ for different $K$ and $\alpha$, where $t_{K-1}(1-\alpha)$ denotes the $100(1-\alpha)\%$
quantile of the t distribution with $K - 1$ degrees of freedom. It can be seen from Figure 1 that
$\mathcal{T}(t_{K-1}(1-\alpha), K)/K$ increases rapidly for $K < 10$ and it becomes stable for relatively large
$K$. For each $K \ge 2$, $\mathcal{T}(t_{K-1}(1-\alpha), K)/K$ is an increasing function of $\alpha$. In the simulation
work of Ibragimov and Müller (2010) (see Figure 2 therein), they found that the size of the
subsampling based t-test is relatively robust to the correlations if $K$ is small (say $K = 4$ in
their simulation). This finding is in fact supported by our theory. For $K \le 4$, the magnitude
of $\mathcal{T}(x, K)$ is rather small, so the leading error term is small across a range of correlations.
As $K$ increases, the first order approximation deteriorates, which is reflected in the increasing
magnitude of $\mathcal{T}(t_{K-1}(1-\alpha), K)$ with respect to $K$.

Notice that $\mathcal{T}(t_{K-1}(1-\alpha), K)$ is always positive and $\sigma^2 > 0$ by assumption, so the sign
of the leading error term, i.e., $-\frac{B}{\sigma^2 T}\mathcal{T}(x, K)$, is determined by $B$. When $B > 0$ (e.g., AR(1)
process with positive coefficient), the first order based inference tends to be oversized and
conversely it tends to be undersized when $B < 0$ (e.g., MA(1) process with negative coefficient).
Some simulations for AR(1) and MA(1) models in the Gaussian location model support these
theoretical findings. We decide not to report these results to conserve space. Given the sample
size $T$, the size distortion for the first order based inference may be severe if the ratio $B/\sigma^2$ is
large. For example, this is the case for the AR(1) model, $X_t = \rho X_{t-1} + \epsilon_t$, as the correlation $\rho$ gets
closer to 1. As indicated by Figure 1, we show in the following proposition that
$\mathcal{T}(t_{K-1}(1-\alpha), K)/K$ converges as $K \to \infty$.
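
To see how quickly the ratio $B/\sigma^2$ grows for an AR(1) process, the sketch below evaluates $B = \sum_h |h|\gamma_X(h)$ and $\sigma^2 = \sum_h \gamma_X(h)$ by truncated summation and compares the result with the closed form $2\rho/(1-\rho^2)$, which follows from $\gamma_X(h) = \rho^{|h|}/(1-\rho^2)$. Unit innovation variance is assumed (it cancels in the ratio), and the function names and truncation point are illustrative.

```python
def ar1_bias_ratio(rho, max_lag=10_000):
    """Truncated-sum value of B / sigma^2 for X_t = rho * X_{t-1} + eps_t.

    Assumptions: |rho| < 1 and unit innovation variance, so that
    gamma(h) = rho^|h| / (1 - rho^2).
    """
    gamma0 = 1.0 / (1.0 - rho * rho)
    sigma2 = gamma0   # accumulates sum_h gamma(h)
    b = 0.0           # accumulates sum_h |h| * gamma(h)
    term = gamma0
    for h in range(1, max_lag + 1):
        term *= rho               # term = gamma(h)
        sigma2 += 2.0 * term      # lags +h and -h
        b += 2.0 * h * term
    return b / sigma2


def ar1_bias_ratio_closed_form(rho):
    """Closed form 2 * rho / (1 - rho^2) for the same ratio."""
    return 2.0 * rho / (1.0 - rho * rho)
```

The ratio is positive for positive $\rho$, negative for negative $\rho$, and grows without bound as $\rho$ approaches 1, in line with the discussion of size distortion above.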

Proposition 5.1. As $K \to +\infty$, we have $\mathcal{T}(x, K)/K = 2x^2 G_1'(x^2) + O(1/K)$, for any fixed $x$.

Figure 1: Simulated values of $\mathcal{T}(t_{K-1}(1-\alpha), K)/K$ based on 500,000 replications.

Under the local alternative $\mu = \mu_0 + \delta\sigma/\sqrt{T}$ with $\delta \ne 0$, we can derive a similar
expansion for $T_K$ with $K$ fixed. Formally, let $Z$ be a random variable following the standard
normal distribution and $S_{K-1} = \sqrt{\chi^2_{K-1}/(K-1)}$ with the $\chi^2_{K-1}$ distribution being independent
of $Z$. Then the quantity $t_{K-1,\delta} = (Z + \delta)/S_{K-1}$ follows a noncentral t distribution with
noncentrality parameter $\delta$. Define $e_1(x) = E[I\{|t_{K-1,\delta}| > x\}Z^2]$ and $e_2(x) = E[I\{|t_{K-1,\delta}| >
x\}\chi^2_{K-1}]$. Then under the local alternative, we have

$$P(|T_K| \le x) = P(|t_{K-1,\delta}| \le x) - \frac{B}{\sigma^2 T}\mathcal{T}_\delta(x, K) + O(1/T^2),$$

where $\mathcal{T}_\delta(x, K) = K^2 P(|t_{K-1,\delta}| > x) - e_1(x) - (K+1)e_2(x)$. For fixed $\delta$, $P(|t_{K-1,\delta}| >
t_{K-1}(1-\alpha))$ is a monotonically increasing function of $K$. An unreported numerical study shows
that $\mathcal{T}_\delta(t_{K-1}(1-\alpha), K)$ is roughly monotonic with respect to $K$ for $\delta \in (0, 4]$, which suggests
that larger $K$ tends to deliver more power when $B > 0$. Combined with the previous discussion,
we see that the choice of $K$ leads to a trade-off between the size distortion and power loss.

Remark 5.1. Theorem 5.1 gives the ERP and the exact form of the leading error term under
the fixed-K asymptotics. The high order expansion derived here is based on an expansion of the
density function of $(\hat\mu_1, \dots, \hat\mu_K)$, which is made possible by the Gaussian assumption. Expansion
for a distribution function or equivalently characteristic function has been used in the high order
expansion of the finite sample distribution under the Gaussian assumption [see e.g., Velasco and
Robinson (2001), and Sun et al. (2008)]. With $K$ fixed in the asymptotics, the variance of the
LRV estimator is captured by the first order fixed-K limiting distribution and the bias of the
LRV estimator is reflected in the higher order term $-\frac{B}{\sigma^2 T}\mathcal{T}(x, K)$.

Remark 5.2. When the number of groups $K$ grows slowly with the sample size $T$, the
Edgeworth expansion for $T_K$ was developed for $P(T_K \le x)$ in Lahiri (2007; 2010) under the general
non-Gaussian setup. The expansion given here is different from the usual Edgeworth expansion
under the increasing-smoothing asymptotics in terms of the form and the convergence rate.
Using the same argument, we can show that under the fixed-K asymptotics, the leading error
term in the expansion of $P(T_K \le x)$ is of order $O(1/T)$ under the Gaussian assumption. In the
non-Gaussian case, we conjecture that the order of the leading error term is $O(1/\sqrt{T})$, which
is due to the effect of the third and fourth order cumulants.

The higher order Edgeworth expansion results in Sun et al. (2008) suggest that the fixed-b
based approximation is a refinement of the approximation provided by the limiting distribution
derived under the small-b asymptotics. In a similar spirit, it is natural to ask if the fixed-K based
approximation refines the first order approximation under the increasing-K asymptotics. To
address this question, we consider the expansion under the increasing-smoothing asymptotics,
where $K$ grows slowly with the sample size $T$.

Proposition 5.2. Under the same conditions as in Theorem 5.1 but with $\lim_{T \to +\infty}(1/K + K/T) = 0$,
we have

$$P(|T_K| \le x) = G_1(x^2) + \frac{1}{K}x^4 G_1''(x^2) - \frac{2BK}{\sigma^2 T}x^2 G_1'(x^2) + O(1/T). \qquad (12)$$

Remark 5.3. Since

$$P(|t_{K-1}| \le x) = G_1(x^2) + \frac{1}{K-1}x^4 G_1''(x^2) + O(1/K^2)$$

[see e.g., Sun (2011a)], we know that the fixed-K based approximation captures the first two
terms in (12), whereas the increasing-K based approximation (i.e., $\chi^2_1$) only captures the first
term. In view of Proposition 5.1, it is not hard to see that

$$\Psi_K(x) = G_1(x^2) + \frac{1}{K-1}x^4 G_1''(x^2) - \frac{2BK}{\sigma^2 T}x^2 G_1'(x^2) + O(1/K^2) + O(1/T),$$

which implies that the fixed-K based expansion is able to capture all three terms in (12)
as the smoothing parameter $K \to \infty$ with $T^{1/2} = o(K)$. Loosely speaking, this suggests that
the fixed-K based expansion holds for a broad range of $K$ and it gets close to the corresponding
increasing-K based expansion when $K$ is large.

5.2 Fixed-b expansion (with K = +∞)

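Consider a semi-positive definite bivariate kernel $\mathcal{G}(\cdot, \cdot)$ which satisfies the spectral
decomposition

$$\mathcal{G}(r, t) = \sum_{j=1}^{+\infty}\lambda_j\phi_j(r)\phi_j(t), \quad 0 \le r, t \le 1, \qquad (13)$$

where $\{\phi_j\}$ are the eigenfunctions and $\{\lambda_j\}$ are the eigenvalues, which are in descending
order, i.e., $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$. Here we set $b = 1$ for convenience of presentation. See
Remark 5.4 for the case $b \in (0, 1)$. Define the projection vectors $\xi_j = T^{-1/2}\sum_{i=1}^{T}\tilde\phi_j(i/T)X_i$
with $\tilde\phi_j(t) = \phi_j(t) - \frac{1}{T}\sum_{i=1}^{T}\phi_j(i/T)$ for $j = 1, 2, \dots$. Here the dependence of $\xi_j$ on $T$ is
suppressed to simplify the notation. Following Sun (2011a), we limit our attention to the case
$\int_0^1 \phi_j(t)dt = 0$ (e.g., Fourier basis and Haar wavelet basis)⁴. Then the LRV estimator can be
written as

$$\hat D_T = \sum_{j=1}^{+\infty}\lambda_j\xi_j^2 = \sum_{j=1}^{+\infty}\lambda_j\left\{\frac{1}{\sqrt{T}}\sum_{i=1}^{T}\phi_j(i/T)(X_i - \bar X_T)\right\}^2,$$

where $\bar X_T$ is the sample mean. Again we focus on the hypothesis testing problem ($H_{2,0}$ versus
$H_{2,a}$). Define a sequence of random variables

$$F_T(J) = \frac{\xi_0^2}{\sum_{j=1}^{J}\lambda_j\xi_j^2}, \quad J = 1, 2, \dots, \infty,$$

with $\xi_0 = T^{-1/2}\sum_{i=1}^{T}(X_i - \mu_0)$. Our test statistic is $F_T(\infty) = \xi_0^2/\hat D_T$. Let $\{v_i\}_{i=0}^{+\infty}$ be a sequence
of iid standard normal random variables. Further define $\mathcal{F}_\infty(v) = v_0^2/\sum_{j=1}^{+\infty}\lambda_j v_j^2$ and

$$\psi_T(x) = \frac{1}{2\sigma^2}\sum_{i=0}^{+\infty}\left(\mathrm{var}(\xi_i) - \sigma^2\right)E[(v_i^2 - 1)I\{\mathcal{F}_\infty(v) \le x\}]. \qquad (14)$$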

^Given any semi-positive definite kernel $\mathcal{G}(\cdot,\cdot)$, we can define the demeaned kernel

$$\tilde{\mathcal{G}}(r,t) = \mathcal{G}(r,t) - \int_0^1\mathcal{G}(s,t)ds - \int_0^1\mathcal{G}(r,p)dp + \int_0^1\int_0^1\mathcal{G}(s,p)dsdp.$$

Suppose $\tilde{\mathcal{G}}(\cdot,\cdot)$ admits the spectral decomposition $\tilde{\mathcal{G}}(r,t) = \sum_{i=1}^{+\infty}\tilde\lambda_i\tilde\phi_i(r)\tilde\phi_i(t)$ with $\{\tilde\phi_i\}$ and $\{\tilde\lambda_i\}$ being
the eigenfunctions and eigenvalues respectively. Notice that

$$\int_0^1\int_0^1\tilde{\mathcal{G}}(r,t)drdt = \sum_{i=1}^{+\infty}\tilde\lambda_i\Big(\int_0^1\tilde\phi_i(t)dt\Big)^2 = 0,$$

which implies $\int_0^1\tilde\phi_i(t)dt = 0$ whenever $\tilde\lambda_i > 0$, i.e., the eigenfunctions of the demeaned kernel are all
mean zero.

The following theorem establishes the asymptotic expansion of the distribution of $F_T(\infty)$.

Theorem 5.2. Assume the kernel $\mathcal{G}(\cdot,\cdot)$ satisfies the following conditions:

(1) The second derivatives of the eigenfunctions $\{\phi_j\}_{j=1}^{+\infty}$ exist. Further assume that the
eigenfunctions are mean zero and satisfy $\sup_{1\le i\le J}\sup_{t\in[0,1]}|\phi_i^{(j)}(t)| < CJ^j$, for $j = 0,1,2$,
$J \in \mathbb{N}$, and some constant $C$ which does not depend on $j$ and $J$;^

(2) The eigenvalues satisfy $\lambda_n = O(1/n^a)$, for some $a > 19$.

Under the assumption that $\{X_t\}$ is a stationary Gaussian time series with $\sigma^2 > 0$ and
$\sum_{h=-\infty}^{+\infty}h^2|\gamma_X(h)| < \infty$, and under the null hypothesis $H_{2,0}$, we have $\sup_{x\in[0,+\infty)}|\psi_T(x)| = O(1/T)$
and

$$\sup_{x\in[0,+\infty)}|P(F_T(\infty) \le x) - P(\mathcal{F}_\infty(v) \le x) - \psi_T(x)| = o(1/T). \qquad (15)$$

Assume $\int_0^1\mathcal{G}(r,r)dr = \sum_{j=1}^{+\infty}\lambda_j = 1$. As seen from Theorem 5.2, the bias of the LRV
estimator (i.e., $\sum_{j=1}^{+\infty}\lambda_j(\mathrm{var}(\xi_j) - \sigma^2)$) is reflected in the leading error term $\psi_T(x)$, which is a
weighted sum of the relative difference of $\mathrm{var}(\xi_j)$ and $\sigma^2$. Note that the difference $\mathrm{var}(\xi_j) - \sigma^2$
relies on the second order properties of the time series and the eigenfunctions of $\mathcal{G}(\cdot,\cdot)$, and the
weight $E[(v_j^2 - 1)I\{\mathcal{F}_\infty(v) \le x\}]$, which depends on the eigenvalues of $\mathcal{G}(\cdot,\cdot)$, is of order $O(\lambda_j)$,
as seen from the arguments used in the proof of Theorem 5.2.

Remark 5.4. In the appendix (see Lemma 8.6), we establish the higher order expansion for
the Wald statistic based on the general LRV estimators considered in Section 3. This result
can be viewed as a special case of Theorem 5.2 when the kernel function belongs to a finite
dimensional space. For $0 < b < 1$, we define $\mathcal{G}_b(\cdot,\cdot) = \mathcal{G}(\cdot/b,\cdot/b)$. If $\mathcal{G}(\cdot,\cdot)$ is semi-positive
definite on $[0,1/b]^2$, then $\mathcal{G}_b$ satisfies the spectral decomposition $\mathcal{G}_b(r,t) = \sum_{j=1}^{+\infty}\lambda_{j,b}\phi_{j,b}(r)\phi_{j,b}(t)$ with
$0 \le r,t \le 1$^. Our result can then be extended to the case where $b < 1$ if the assumptions in
Theorem 5.2 hold for $\{\lambda_{j,b}\}$ and $\{\phi_{j,b}\}$. It is worth noting that our result is established under
different assumptions from those of Theorem 6 in Sun et al. (2008), where the bivariate
kernel is defined as $\mathcal{G}(r,t) = \mathcal{K}(r - t)$ and the technical assumption $b < 1/(16\int_{-\infty}^{\infty}|\mathcal{K}(r)|dr)$ is
required, which rules out the case $b = 1$ for most kernels. Here we provide an alternative way
of proving the $O(1/T)$ ERP when the eigenfunctions are mean zero. Furthermore, we provide
the exact form of the leading error term, which has not been obtained in the literature.

In the econometric and statistical literature, the bivariate kernel $\mathcal{G}(\cdot,\cdot)$ is usually defined through
a semi-positive definite univariate kernel $\mathcal{K}(\cdot)$, i.e., $\mathcal{G}(r,t) = \mathcal{K}(r - t)$. In what follows, we make
several remarks regarding this special case.

^This condition is on the oscillation of the basis functions. It is satisfied by the Fourier basis functions
$\{\sqrt{2}\cos(2\pi jt), \sqrt{2}\sin(2\pi jt)\}_{j=1}^{+\infty}$.

^Given a semi-positive definite kernel $\mathcal{G}_b(r,t)$, its eigencomponents can be obtained by solving a
homogeneous Fredholm integral equation of the second kind, where the solutions can be approximated
numerically when analytical solutions are unavailable. When $\mathcal{G}(r,t) = \mathcal{K}(r - t)$, it was shown in Knessl and
Keller (1991) that under suitable assumptions on $\mathcal{K}(\cdot)$, $\lambda_{j,b} = b\int_{-\infty}^{\infty}\mathcal{K}(r)dr - (\pi^2j^2b^3/2)\int_{-\infty}^{\infty}r^2\mathcal{K}(r)dr + o(b^3)$
and $\phi_{j,b}(x) \approx \sqrt{2}\sin(\pi jx)$ for $x$ bounded away from 0 and 1 as $b \to 0$, which implies that $\lambda_{M,b}/\lambda_{1,b} \to 1$
for any fixed $M \in \mathbb{N}$ and $b \to 0$.

Remark 5.5. The assumption on the eigenvalues is satisfied by the bivariate kernel defined
through the QS kernel and the Daniell kernel with $0 < b \le 1$, and the Tukey-Hanning kernel with
$b = 1$, because these kernels are analytic on the corresponding regions and their eigenvalues
decay exponentially fast [see Little and Reade (1984)]. Note that the assumption does not hold
for the Bartlett kernel because the decay rate of its eigenvalues is of order $O(1/n^2)$. For the
demeaned Tukey-Hanning kernel with $b = 1$, the eigenfunctions are $\phi_1(t) = \sqrt{2}\cos\pi t$
and $\phi_2(t) = (\sin\pi t - 2/\pi)/\sqrt{1/2 - 4/\pi^2}$, with eigenvalues $\lambda_1 = 0.25$, $\lambda_2 = 0.0474$, and $\lambda_j = 0$ for $j \ge 3$. It
is not hard to construct a kernel that satisfies the conditions in Theorem 5.2. For example,
one can consider the kernel $\mathcal{K}(r - t) = \sum_{j=1}^{+\infty}\lambda_j\{\cos(2\pi jr)\cos(2\pi jt) + \sin(2\pi jr)\sin(2\pi jt)\} =
\sum_{j=1}^{+\infty}\lambda_j\cos(2\pi j(r - t))$ with $\sum_{j=1}^{+\infty}\lambda_j = 1$ and $\lambda_j = O(1/j^{19+\epsilon})$ for some $\epsilon > 0$. Then the
asymptotic expansion (15) holds for the Wald statistic based on the difference kernel $\mathcal{G}(r,t) =
\mathcal{K}(r - t)$.
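
The eigenvalues quoted in Remark 5.5 for the demeaned Tukey-Hanning kernel can be approximated by discretizing the kernel on a grid. The sketch below is illustrative only: it assumes the Tukey-Hanning kernel has the form $\mathcal{K}(x) = (1 + \cos\pi x)/2$ for $|x| \le 1$, uses a midpoint-rule discretization, and all function names and the grid size are hypothetical choices rather than anything prescribed in the text.

```python
import numpy as np

def tukey_hanning(x):
    """Tukey-Hanning kernel, assumed here to be (1 + cos(pi x)) / 2 on [-1, 1] and 0 elsewhere."""
    return np.where(np.abs(x) <= 1.0, 0.5 * (1.0 + np.cos(np.pi * x)), 0.0)

def demeaned_kernel_eigenvalues(kernel, b=1.0, n=400):
    """Approximate the eigenvalues of the demeaned kernel K((r - t)/b) on [0, 1]^2."""
    grid = (np.arange(n) + 0.5) / n                   # midpoint grid on [0, 1]
    gram = kernel((grid[:, None] - grid[None, :]) / b)
    centering = np.eye(n) - np.ones((n, n)) / n       # discrete analogue of demeaning
    demeaned = centering @ gram @ centering
    eigenvalues = np.linalg.eigvalsh(demeaned / n)    # 1/n is the quadrature weight
    return eigenvalues[::-1]                          # sorted in descending order

lam = demeaned_kernel_eigenvalues(tukey_hanning)
print(lam[:3])
```

Under these assumptions the two leading values should be close to 0.25 and 0.0474, with the remaining ones numerically zero, since the demeaned kernel has rank two.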

Define the Parzen characteristic exponent

$$q = \max\Big\{q_0 : q_0 \in \mathbb{Z}^+,\ g_{q_0} = \lim_{x\to0}\frac{1 - \mathcal{K}(x)}{|x|^{q_0}} < \infty\Big\}.$$

For the Bartlett kernel, $q$ is 1; for the Parzen and QS kernels, $q$ is equal to 2. Let
$c_1 = \int_{-\infty}^{\infty}\mathcal{K}(x)dx$ and $c_2 = \int_{-\infty}^{\infty}\mathcal{K}^2(x)dx$. We summarize the first and second order approximations
for the distribution of the studentized sample mean in the Gaussian location model based
on both fixed-$b$ and small-$b$ asymptotics in Table 1 below. The formula for the second order
approximation under the small-$b$ asymptotics is from Velasco and Robinson (2001).
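
As a rough numerical illustration of the Parzen characteristic exponent, one can evaluate the ratio $(1 - \mathcal{K}(x))/|x|^{q_0}$ at a small $x$. This is a sketch under stated assumptions: the kernel formulas below are the standard textbook forms of the Bartlett, Parzen and QS kernels, and the helper name `exponent_ratio` is hypothetical.

```python
import math

def bartlett(x):
    return max(0.0, 1.0 - abs(x))

def parzen(x):
    a = abs(x)
    if a <= 0.5:
        return 1.0 - 6.0 * a ** 2 + 6.0 * a ** 3
    if a <= 1.0:
        return 2.0 * (1.0 - a) ** 3
    return 0.0

def quadratic_spectral(x):
    if x == 0.0:
        return 1.0
    z = 6.0 * math.pi * x / 5.0
    return 3.0 * (math.sin(z) / z - math.cos(z)) / z ** 2

def exponent_ratio(kernel, q0, x=1e-2):
    """(1 - K(x)) / |x|^q0 at a small x, used as a proxy for the limit g_{q0}."""
    return (1.0 - kernel(x)) / abs(x) ** q0

for name, kernel in [("Bartlett", bartlett), ("Parzen", parzen), ("QS", quadratic_spectral)]:
    print(name, exponent_ratio(kernel, 1), exponent_ratio(kernel, 2))
```

The Bartlett ratio stays bounded for $q_0 = 1$ but grows like $1/|x|$ for $q_0 = 2$, whereas the Parzen and QS ratios settle near finite constants for $q_0 = 2$, consistent with $q = 1$ and $q = 2$ respectively.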



Table 1: Asymptotic comparison between the first and second order approximations based on fixed-$b$ and small-$b$ asymptotics.



Asymptotics 


First order 




Second order 


Fixed-6 
Sniall-6 


P 




P 

Gi(a;) + (c2G'/(x)a;2- 


\ Q{b) - 





Note: $Q(b) = \sum_{j=1}^{+\infty}\lambda_{j,b}v_j^2$, where $\{\lambda_{j,b}\}$ are the eigenvalues of the kernel $\mathcal{K}((r - t)/b)$.



Remark 5.6. A few remarks are in order regarding Table 1. First of all, it is worth noting that
the fixed-$b$ limiting distribution function equals $G_1(x) + (c_2G_1''(x)x^2 - c_1G_1'(x)x)b + O(b^2)$ as $b \to 0$ [see Sun et al. (2008)], which
suggests that the fixed-$b$ limiting distribution captures the first two terms in the higher order
asymptotic expansion under the small-$b$ asymptotics and thus provides a better approximation
than the $\chi^2_1$ approximation. Secondly, it is interesting to compare the second order asymptotic
expansions under the fixed-$b$ asymptotics and small-$b$ asymptotics. We show in Proposition 5.3
that the higher order expansion under fixed-$b$ asymptotics is consistent with the corresponding
higher order expansion under small-$b$ asymptotics as $b$ approaches zero.

Because our fixed-$b$ expansion is established under the assumption that the eigenfunctions
have mean zero, we shall consider the Wald statistic $F_T(\infty)$ based on the demeaned kernel
$\mathcal{G}_b(r,t) = \mathcal{K}_b(r - t) - \int_0^1\mathcal{K}_b(s - t)ds - \int_0^1\mathcal{K}_b(r - p)dp + \int_0^1\int_0^1\mathcal{K}_b(s - p)dsdp$ for $b \in (0,1]$. Let
$\{\phi_{j,b}\}$ and $\{\lambda_{j,b}\}$ be the corresponding eigenfunctions and eigenvalues of $\mathcal{G}_b(\cdot,\cdot)$.

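Proposition 5.3. Suppose $\mathcal{K}(\cdot) : \mathbb{R} \to [0,1]$ is symmetric, semi-positive definite, piecewise
smooth with $\mathcal{K}(0) = 1$ and $\int_0^{\infty}x\mathcal{K}(x)dx < \infty$. The Parzen characteristic exponent of $\mathcal{K}$ is no
less than one. Further assume that

$$\sup_{k\in\mathbb{N}}\Big|\sum_{i=1}^{k}\lambda_{i,b}(\mathrm{var}(\xi_{i,b}) - \sigma^2)\Big| = O\Big(\Big|\sum_{i=1}^{+\infty}\lambda_{i,b}(\mathrm{var}(\xi_{i,b}) - \sigma^2)\Big|\Big), \quad \text{as } b + 1/(bT) \to 0, \qquad (16)$$

where $\xi_{j,b}$ is defined by replacing $\phi_j$ with $\phi_{j,b}$ in the definition of $\xi_j$. Then under the assumption
that $\sigma^2 > 0$ and $\sum_{h=-\infty}^{+\infty}h^2|\gamma_X(h)| < \infty$, we have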



^T,b{x) = G'Ax)x^ + 0(1)) + o(i/r), 

for fixed $x \in \mathbb{R}$, as $b \to 0$ and $bT \to +\infty$.

In Proposition 5.3, condition (16) is not primitive. It requires that the bias of the
LRV estimators based on the kernel $\mathcal{G}_{k,b}(r,t) = \sum_{j=1}^{k}\lambda_{j,b}\phi_{j,b}(r)\phi_{j,b}(t)$ is of the same or smaller
order than the bias of the LRV estimator based on $\mathcal{G}_b(r,t)$. This condition simplifies our technical
arguments and it can be verified through a case-by-case study. As shown in Proposition 5.3,
the fixed-$b$ expansion is consistent with the small-$b$ expansion as $b$ approaches zero, and it is
expected to be more accurate in approximating the finite sample distribution when $b$ is
relatively large. Overall, the above result suggests that the fixed-$b$ expansion provides
a good approximation to the finite sample distribution which holds for a broad range of $b$.



6 Gaussian Dependent Bootstrap 

Given the higher order expansions presented in Section 5, it seems natural to investigate
whether the bootstrap can help to improve the first order approximation. To present the idea, we again
limit our attention to the univariate Gaussian location model. Consider a consistent estimate
of the covariance matrix of $\{X_t\}_{t=1}^{T}$ which takes the form $\hat\Sigma(\omega;l) \in \mathbb{R}^{T\times T}$ with the $(i,j)$th
element given by $\omega_l(i - j)\hat\gamma_X(|i - j|)$ for $i,j = 1,2,\ldots,T$, where $\omega$ is a kernel function with
$\omega_l(x) = \omega(x/l)$ and $\hat\gamma_X(h) = \frac{1}{T}\sum_{t=1}^{T-h}(X_t - \bar{X}_T)(X_{t+h} - \bar{X}_T)$ for $h = 0,1,2,\ldots,T-1$. Estimating
the covariance matrix of a stationary time series has been investigated by a few researchers. See
Wu and Pourahmadi (2009) for the use of a banded sample covariance matrix and McMurry
and Politis (2011) for a tapered version of the sample covariance matrix. In what follows, we
shall consider the Bartlett kernel, i.e., $\omega(x) = (1 - |x|)I\{|x| < 1\}$, which is guaranteed to yield a
semi-positive definite estimate, i.e., $\hat\Sigma(\omega;l) \ge 0$.

We now introduce a simple bootstrap procedure which can be shown to be second order
correct. Suppose $X_1^*,\ldots,X_T^*$ is the bootstrap sample generated from $N(0,\hat\Sigma(\omega;l))$. It is easy to
see that the $X_i^*$'s are stationary and Gaussian conditional on the data. This is why we name this
bootstrap method the "Gaussian Dependent Bootstrap". There is a large literature on the bootstrap
for time series; see Lahiri (2003) for a review. However, most of the existing bootstrap methods
do not deliver a conditionally normally distributed bootstrap sample. Since our higher order
results are obtained under the Gaussian assumption, we need to generate a Gaussian bootstrap
sample in order for our expansion results to be useful.
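
The resampling step described above can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the Bartlett taper is used as in the text, and the draw from $N(0,\hat\Sigma)$ is done through an eigendecomposition (an implementation choice made here so that a singular $\hat\Sigma$ causes no difficulty).

```python
import numpy as np

def bartlett_tapered_covariance(x, l):
    """T x T matrix whose (i, j) entry is w((i - j)/l) * gamma_hat(|i - j|), with Bartlett w."""
    x = np.asarray(x, dtype=float)
    T = x.size
    centered = x - x.mean()
    # gamma_hat(h) = (1/T) * sum_{t=1}^{T-h} (X_t - mean)(X_{t+h} - mean)
    gamma_hat = np.array([centered[: T - h] @ centered[h:] for h in range(T)]) / T
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    weights = np.clip(1.0 - lags / l, 0.0, None)
    return weights * gamma_hat[lags]

def gaussian_dependent_bootstrap(x, l, n_boot, rng):
    """Draw n_boot samples of length T from N(0, Sigma_hat), conditional on the data x."""
    sigma_hat = bartlett_tapered_covariance(x, l)
    eigval, eigvec = np.linalg.eigh(sigma_hat)
    # Clip tiny negative eigenvalues caused by rounding; Sigma_hat = root @ root.T
    root = eigvec * np.sqrt(np.clip(eigval, 0.0, None))
    return rng.standard_normal((n_boot, sigma_hat.shape[0])) @ root.T

rng = np.random.default_rng(0)
x = rng.standard_normal(50)              # placeholder data, for illustration only
sigma_hat = bartlett_tapered_covariance(x, l=5)
x_star = gaussian_dependent_bootstrap(x, l=5, n_boot=1000, rng=rng)
print(sigma_hat.shape, x_star.shape)
```

Each row of `x_star` is one bootstrap series; the bootstrapped statistic would then be computed row by row and its empirical quantiles used as critical values.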

Denote by $T_K^*$ the bootstrapped subsampling t-statistic obtained by replacing $(X_1 - \mu_0, X_2 -
\mu_0,\ldots,X_T - \mu_0)$ with $(X_1^*,X_2^*,\ldots,X_T^*)$. Define the bootstrapped projection vectors $\xi_0^* =
T^{-1/2}\sum_{j=1}^{T}X_j^*$ and $\xi_i^* = T^{-1/2}\sum_{j=1}^{T}\phi_i^0(j/T)X_j^*$ for $i = 1,\ldots,+\infty$. Let $P^*$ be the bootstrap
probability measure conditional on the data. The following theorems state the second order accuracy
of the Gaussian dependent bootstrap in the univariate Gaussian location model.

Theorem 6.1. For the Gaussian location model, under the same conditions in Theorem 5.1 
and 1/1 + l^/T — )• 0, we have 

$$\sup_{x\in[0,+\infty)}|P(|T_K| \le x) - P^*(|T_K^*| \le x)| = O_p(1/T). \qquad (17)$$

Remark 6.1. When $K$ grows slowly with the sample size, the higher order expansions depend
on the second order properties only through the quantities $\sum_{h=-\infty}^{+\infty}|h|^k\gamma_X(h)$ with $k = 0,1,2$
for the subsampling t-statistic and the Wald statistic based on the series variance estimator
[see Proposition 5.2 and Theorem 4 of Sun (2011a)]. This suggests that the Gaussian dependent
bootstrap also preserves the second order accuracy under the increasing-domain asymptotics
provided that $\sum_{h=-\infty}^{+\infty}|h|^2|\gamma_X(h)| < \infty$. A rigorous proof is omitted due to space limitations.

Theorem 6.2. For the Gaussian location model, under the assumptions in Theorem 5.2 and 
that l/l + l^/T 0, we have 

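$$\sup_{x\in[0,+\infty)}|P(F_T(\infty) \le x) - P^*(F_T^*(\infty) \le x)| = O_p(1/T), \qquad (18)$$

where $F_T^*(\infty) = (\xi_0^*)^2/\sum_{j=1}^{+\infty}\lambda_j(\xi_j^*)^2$ with $\{\lambda_j\}_{j=1}^{+\infty}$ given in (13). Note that $F_T^*(\infty) = (\xi_0^*)^2/D_T^*$, where
$D_T^* = \frac{1}{T}\sum_{i,j=1}^{T}\mathcal{G}(i/T,j/T)(X_i^* - \bar{X}_T^*)(X_j^* - \bar{X}_T^*)$ and $\bar{X}_T^*$ is the bootstrap sample mean.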

The bootstrap-based autocorrelation robust testing procedures have been well studied in
both the econometric and statistical literature under the increasing-smoothing asymptotics. In the
statistical literature, Lahiri (1996) showed that for the studentized M-estimator, the ERP of the
moving block bootstrap (MBB)-based testing procedure is of order Op(r~^/^) which provides 
an asymptotic refinement to the normal approximation. Under the framework of the smooth
function model, Götze and Künsch (1996) showed that the ERP for the MBB-based test is of
order $O_p(T^{-3/4+\epsilon})$ for any $\epsilon > 0$ when the HAC estimator is constructed using the truncated
kernel. Note that in the latter paper, the HAC estimator used in the studentized bootstrap
statistic needs to take a different form from the original HAC estimator to achieve the higher
order accuracy. Also see Lahiri (2007) for a recent contribution. In the econometric literature,
the Edgeworth analysis for the block bootstrap has been conducted by Hall and Horowitz (1996),
Andrews (2002) and Inoue and Shintani (2006), among others, in the GMM framework. Within
the increasing-smoothing asymptotic framework, it is still unknown whether the bootstrap can
achieve an ERP of $O_p(1/T)$ when a HAC covariance matrix estimator is used for studentization
[see Härdle, Horowitz and Kreiss (2003)].^

Within the fixed-smoothing asymptotic framework, Jansson (2004) established that the error
of the fixed-$b$ approximation is of order $O(\log(T)/T)$ for the Gaussian location model and the
case $b = 1$, which was further refined by Sun et al. (2008) by dropping the $\log(T)$ term. In the
non-Gaussian setting, Gonçalves and Vogelsang (2011) showed that the fixed-$b$ approximation
has an ERP of order o(T~^/^"'"^) for any e > when all moments exist. The latter authors 
further showed that the moving block bootstrap (with the iid bootstrap as a special case) is able
to replicate the fixed-$b$ limiting distribution and thus provides a more accurate approximation
than the normal approximation. However, because the exact form of the leading error term
was not obtained in their studies, their results do not seem directly applicable to showing the higher
order accuracy of the bootstrap under the fixed-$b$ asymptotics. Using the asymptotic expansion
results developed in Section 5, we show that the Gaussian dependent bootstrap can achieve
an ERP of order $O_p(1/T)$ under the Gaussian assumption. This appears to be the first result
that shows the higher order accuracy of the bootstrap under the fixed-smoothing asymptotics. Our
result also provides a positive answer to the open question mentioned in Härdle, Horowitz and
Kreiss (2003) of whether the bootstrap can achieve an ERP of $O_p(1/T)$ in the dependent
case when a HAC covariance matrix estimator is used for studentization.^

In the following, we conduct a small simulation study to compare and contrast the finite
sample performance of the small-$b$ approximation, fixed-$b$ approximation, MBB, Gaussian
dependent bootstrap (GDB), and the Edgeworth approximation derived by Velasco and Robinson
(2001). Following the setup in Gonçalves and Vogelsang (2011), we consider the AR(1) model,

$$y_t = \rho y_{t-1} + \sqrt{1 - \rho^2}\,\epsilon_t, \quad t = 1,2,\ldots,T, \qquad (19)$$

with $\{\epsilon_t\}$ being a sequence of iid $N(0,1)$ or $t(3)$ random variables. Consider the Wald statistic
based on the HAC estimator with the Bartlett kernel and QS kernel for testing the null
hypothesis $E[y_t] = 0$ versus the alternative that $E[y_t] \ne 0$ at the 5% nominal level. Throughout the
simulation we set $T = 50$ and the number of Monte Carlo replications to be 1000. The bootstrap
tests are based on 1000 replications for each sample. We implement the Edgeworth approximation
in two ways (feasible and infeasible) as described in Gonçalves and Vogelsang (2011). The
simulation results for $b = 0.04, 0.06, 0.08, 0.1, 0.2, \ldots, 1$ and $\rho = -0.7, 0, 0.5, 0.9$ are summarized
in Figures 2-3. We present the results for GDB with $l = 5, 10$ and MBB with block size equal
to 5. It is seen from the figures that the GDB is more accurate than the small-$b$ asymptotic
approximation in most cases, and the improvement is often substantial, especially for large $b$. In the

^Note that Hall and Horowitz (1996) and Andrews (2002) obtained the $O_p(1/T)$ results but they
assumed the uncorrelatedness of the moment conditions after finite lags.

^It is worth noting that our result is established under the fixed-smoothing asymptotics. It seems
that in general the ERP of order $O_p(1/T)$ cannot be achieved under the increasing-domain asymptotics.

dependent cases (e.g., $\rho = -0.7, 0.5$ and $0.9$), the GDB tends to provide a refinement over the
fixed-$b$ approximation for a proper bandwidth, which is consistent with our theoretical findings.
The improvement is apparent when the dependence is strong and $b$ is small. In addition, it is
interesting to note that the GDB provides an improvement not only when the innovations are
Gaussian but also in the case of $t(3)$ distributed fat-tailed innovations. The performance of
GDB and MBB is in general quite close. GDB tends to outperform MBB in the
case of negative dependence, whereas MBB delivers slightly better size in most cases when the
dependence is positive. Finally, note that the performance of the feasible and infeasible Edgeworth
approximation is similar to what has been described in Gonçalves and Vogelsang (2011)
for the one-sided t test. Overall, the simulation results are consistent with those in Gonçalves
and Vogelsang (2011), and they demonstrate the effectiveness of the proposed Gaussian dependent
bootstrap in both Gaussian and non-Gaussian settings. The moving block bootstrap is
expected to be second order accurate, as seen from its empirical performance, but a rigorous
theoretical justification seems very difficult.
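
One replication of the simulation design above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the code used for the reported figures: the function names are hypothetical, only Gaussian innovations and the Bartlett kernel are covered, the series is started from the stationary $N(0,1)$ law (an assumption, since the initialization is not specified in the text), and the LRV estimator is written in the quadratic form $\frac{1}{T}\sum_{i,j}\mathcal{K}((i-j)/(bT))(y_i - \bar{y})(y_j - \bar{y})$.

```python
import numpy as np

def simulate_ar1(T, rho, rng):
    """Generate y_t = rho * y_{t-1} + sqrt(1 - rho^2) * e_t with iid N(0, 1) innovations."""
    y = np.empty(T)
    prev = rng.standard_normal()          # assumption: start from the stationary N(0, 1) law
    scale = np.sqrt(1.0 - rho ** 2)
    for t in range(T):
        prev = rho * prev + scale * rng.standard_normal()
        y[t] = prev
    return y

def bartlett_wald_statistic(y, b, mu0=0.0):
    """Wald statistic T * (ybar - mu0)^2 / LRV with a Bartlett-kernel LRV estimator."""
    T = y.size
    centered = y - y.mean()
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    weights = np.clip(1.0 - lags / (b * T), 0.0, None)   # Bartlett K((i - j) / (bT))
    lrv = centered @ weights @ centered / T
    return T * (y.mean() - mu0) ** 2 / lrv

rng = np.random.default_rng(1)
y = simulate_ar1(T=50, rho=0.5, rng=rng)
stat = bartlett_wald_statistic(y, b=0.2)
print(stat)
```

Repeating this over many replications, and comparing `stat` with a critical value from the chosen approximation (small-$b$, fixed-$b$, or a bootstrap), would give the empirical rejection rates that the figures summarize.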

7 Conclusion 

In this paper, we propose a general class of estimators to estimate the asymptotic covariance
matrix of the GMM estimator in stationary time series models. Our proposal unifies a few
existing covariance matrix estimators and reveals the connection among some recently developed
fixed-smoothing approaches. The first order asymptotic distribution of the Wald statistics with the
general LRV estimator is obtained under the fixed-smoothing asymptotics. Under the framework
of the Gaussian location model, we derive the Edgeworth expansion of the subsampling based
t-statistic and the Wald statistic with the HAC estimator. Our work differs from the existing ones
in two important aspects: (i) the expansion is derived under the fixed-smoothing asymptotics
and the ERP of order $O(1/T)$ is shown for a broad class of fixed-smoothing inference procedures;
(ii) we obtain an explicit form for the leading error term, which is unavailable in the literature.
An in-depth analysis of the behavior of the leading error term when the smoothing parameter
grows with sample size (i.e., $K \to \infty$ in the subsampling t-statistic or $b \to 0$ in the Wald statistic
with the HAC estimator) shows the consistency of our results with the expansion results under
the increasing-smoothing asymptotics. Building on these expansions, we further propose a new
bootstrap method, the Gaussian dependent bootstrap, which provides a higher order correction
than the first order fixed-smoothing approximation. Simulation results strongly suggest the
relevance of our theory and the effectiveness of the Gaussian dependent bootstrap.

We mention a few directions that are worthy of future research. Firstly, it would be interesting
to relax the Gaussian assumption in all the expansions we obtained in the paper. For
non-Gaussian time series, Edgeworth expansions have been obtained by Götze and Künsch (1996),
Lahiri (2007, 2010), among others, for studentized statistics of a smooth function model under
weak dependence assumptions, but their results were derived under the increasing-smoothing
asymptotics. For the location model and studentized sample mean, we conjecture that under
the fixed-smoothing asymptotics, the leading error term in the expansion of its distribution
function involves the third and fourth order cumulants, which reflect the non-Gaussianity,
and the order of the leading error term is $O(T^{-1/2})$ instead of $O(T^{-1})$. Secondly, we expect that
our expansion results will be useful in the optimal choice of the smoothing parameter, the kernel
and its corresponding eigenvalues and eigenfunctions, for a given loss function. The optimal
choice of the smoothing parameter has been addressed in Sun et al. (2008) using the expansion
derived under the increasing-smoothing asymptotics. As the finite sample distribution is better
approximated by the corresponding fixed-smoothing based approximations at either first or second
order than by its increasing-smoothing counterparts, the fixed-smoothing asymptotic theory
proves to be more relevant in terms of explaining the finite sample results [see Gonçalves and
Vogelsang (2011)]. Therefore, it might be worth reconsidering the choice of the optimal smoothing
parameter under the fixed-smoothing asymptotics. Thirdly, we restrict our attention to the
Gaussian location model when deriving the higher order expansions. It would be interesting
to extend the results to the general GMM setting. A recent attempt by Sun (2010) for the
HAC based inference seems to suggest this is feasible. Finally, the second order correctness of
the moving block bootstrap for the studentized sample mean, although suggested by the simulation
results, is still an open but challenging topic for future research.

8 Appendix



8.1 Proof of the main results in section 4 



Proof of Theorem 4-1- Define StiOx) = y X]i=i ''^i- Using the arguments at p. 5 of Sun (2011b), we can 
show that

-, [Tr] 

where $A$ is invertible such that $AA' = R(\theta_0)(G_0'W_0G_0)^{-1}G_0'W_0\Omega W_0G_0(G_0'W_0G_0)^{-1}R(\theta_0)'$ and $W_p(r)$ is
a $p$-dimensional vector of independent Brownian motions. Using summation by parts, we get

T-l 

bT 



1 



,{t/{bT))]-<i>,{{t + l)/{bT)) 



l/bT 



VTSt{0T) + VTM^/b)ST{OT), 



where the last term disappears by recalling the fact that $G_T(\hat\theta_T)'W_Tg_T(\hat\theta_T) = 0$. By the continuous
mapping theorem, we have



Vk 

\VTrieT)J 
Here we are using the fact that



/-Tlo'l>'ii^/b)Bp{r)dr\ 



Uo^Kir/b)B,ir)dr 



J 



fAj^Mr/b)dWpir)\ 

Aj^4>Kir/b)dWp{r) 
\ AW^p(l) J 



-T f <t>'s{r/b)Bp{r)dr=K[ 4>s{r /b)dBp{r) ^ K ( {Mr/b)-[Mr/b)dr}dWpir) 
" Jo Jo Jo Jo 

=A / 4>s{r/b)dWp{r), 
Jo 

for $1 \le s \le K$. It is not hard to see that



and 



Covl I (l)s{r/b)dWp{r), j dWp{r)]^Q 



Gov ( / ~4>,{r/b)dWp{r), / 4>t{r /b)dWp{r) ] = R^Jp, 



•R 0' 
. 1. 



for $1 \le s,t \le K$, which implies

V = {Vl,V2,...,VK,VTr{§T)'y N{0,R®AA'), where R 

We thus get V* = {L® Ip)V -^'^ N{0, LRU ® AA') ^'^ N{0, Ik ® AA'). In other words, V* is free of the 
effect of the basis functions asymptoticahy. Recall that Dt = '}2s=i ^sV*V* , it is not hard to see that 

Ft{K) = (A-iVTr(0T))'{A-'^T(A-i)'}-i(A-iVTr(^T))/p-^'' U^D^^Up/p, 

where $D_p = \sum_{j=1}^{K}\lambda_jv_jv_j'$ and $v_j$ and $U_p$ are iid with distribution $N(0,I_p)$. When $\lambda_j = 1/K$, $j =
1,2,\ldots,K$, it is straightforward to see that $F_T(K) \to^d \frac{K}{K-p+1}F_{p,K-p+1}$.

From the proof of Theorem 4.1, we see that the choice of the normalization matrix $R$ is somewhat
artificial and the limiting distribution is pivotal, for which the critical value is readily available. By
introducing the orthogonalization matrix $R$, the orthonormal and zero mean assumptions for the basis
functions can be dropped; compare Sun (2011b).

Proof of Theorem 4.2. Notice that $\sqrt{T}r(\hat\theta_T) \to^d N(c,AA')$ under the local alternatives. The result
follows from the arguments in Theorem 4.1 and Theorem 5.2.2 in Anderson (2003). ◇



8.2 Proof of the main results in section 5.1 

Consider the $(K+1)$-dimensional multivariate normal density function which takes the form $f(y,\Sigma) =
(2\pi)^{-(K+1)/2}|\Sigma|^{-1/2}\exp(-\frac{1}{2}y'\Sigma^{-1}y)$. We assume the $(i,j)$th element and the $(j,i)$th element of $\Sigma$ are
functionally unrelated. The results can be extended to the case where symmetric matrix elements are
considered functionally equal [see e.g., McCulloch (1982)]. In the following, we use $\otimes$ to denote the
Kronecker product in matrix algebra and use vec to denote the operator that transforms a matrix into
a column vector by stacking the columns of the matrix one underneath the other. For a vector $y \in \mathbb{R}^{l\times1}$
whose elements are differentiable functions of a vector $x \in \mathbb{R}^{k\times1}$, we define $\partial y/\partial x$ to be a $k \times l$ matrix with
the $(i,j)$th element being $\partial y_j/\partial x_i$. The notation $u \asymp v$ represents $u = O(v)$ and $v = O(u)$.



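Lemma 8.1.

$$\frac{\partial f}{\partial\mathrm{vec}(\Sigma)}(y,\Sigma) = \frac{f(y,\Sigma)}{2}\{(\Sigma^{-1}y)\otimes(\Sigma^{-1}y) - \mathrm{vec}(\Sigma^{-1})\}.$$

Proof. Note that

$$\begin{aligned}
\frac{\partial f}{\partial\mathrm{vec}(\Sigma)}(y,\Sigma) &= (2\pi)^{-(K+1)/2}\Big\{\exp\Big(-\frac{1}{2}y'\Sigma^{-1}y\Big)\frac{\partial|\Sigma|^{-1/2}}{\partial\mathrm{vec}(\Sigma)} + |\Sigma|^{-1/2}\frac{\partial}{\partial\mathrm{vec}(\Sigma)}\exp\Big(-\frac{1}{2}y'\Sigma^{-1}y\Big)\Big\} \\
&= (2\pi)^{-(K+1)/2}\Big\{-\frac{1}{2}|\Sigma|^{-1/2}\exp\Big(-\frac{1}{2}y'\Sigma^{-1}y\Big)\mathrm{vec}(\Sigma^{-1}) \\
&\qquad + \frac{1}{2}|\Sigma|^{-1/2}\exp\Big(-\frac{1}{2}y'\Sigma^{-1}y\Big)(\Sigma^{-1}y)\otimes(\Sigma^{-1}y)\Big\} \\
&= \frac{f(y,\Sigma)}{2}\{(\Sigma^{-1}y)\otimes(\Sigma^{-1}y) - \mathrm{vec}(\Sigma^{-1})\},
\end{aligned}$$

where we have used the formulas $\frac{\partial a'X^{-1}b}{\partial\mathrm{vec}(X)} = -(X^{-1}b)\otimes((X^{-1})'a)$ and $\frac{\partial|X|^m}{\partial\mathrm{vec}(X)} = m|X|^{m-1}\frac{\partial|X|}{\partial\mathrm{vec}(X)} =
m|X|^m\mathrm{vec}((X^{-1})')$ [see Theorem 4.3 and Theorem 4.19 in Turkington (2005)]. ◇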
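
The gradient formula in Lemma 8.1, namely $\partial f/\partial\mathrm{vec}(\Sigma) = \frac{f}{2}\{(\Sigma^{-1}y)\otimes(\Sigma^{-1}y) - \mathrm{vec}(\Sigma^{-1})\}$, can be checked against central finite differences in which each entry of $\Sigma$ is perturbed separately, matching the convention that the $(i,j)$th and $(j,i)$th elements are functionally unrelated. The following is a small sketch with hypothetical function names and arbitrary test inputs; it is a sanity check, not a proof.

```python
import numpy as np

def normal_density(y, sigma):
    """Mean-zero multivariate normal density, written so that sigma need not be exactly symmetric."""
    n = y.size
    quad = y @ np.linalg.solve(sigma, y)
    return (2 * np.pi) ** (-n / 2) * np.linalg.det(sigma) ** -0.5 * np.exp(-0.5 * quad)

def gradient_formula(y, sigma):
    """(f / 2) * {(S^{-1} y) kron (S^{-1} y) - vec(S^{-1})}, the expression in Lemma 8.1."""
    inv = np.linalg.inv(sigma)
    u = inv @ y
    return 0.5 * normal_density(y, sigma) * (np.kron(u, u) - inv.flatten(order="F"))

def gradient_finite_difference(y, sigma, eps=1e-6):
    """Central differences, perturbing one entry of sigma at a time (no symmetry imposed)."""
    n = y.size
    grad = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            step = np.zeros((n, n))
            step[i, j] = eps
            grad[i, j] = (normal_density(y, sigma + step)
                          - normal_density(y, sigma - step)) / (2 * eps)
    return grad.flatten(order="F")   # column stacking, as in the vec operator

# Arbitrary illustrative inputs.
sigma = np.array([[2.0, 0.5, 0.2], [0.5, 1.5, 0.3], [0.2, 0.3, 1.0]])
y = np.array([0.3, -1.0, 0.7])
max_gap = np.max(np.abs(gradient_formula(y, sigma) - gradient_finite_difference(y, sigma)))
print(max_gap)
```

A gap at the level of rounding error indicates agreement between the closed form and the numerical derivative for this particular input.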

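Lemma 8.2.

$$\begin{aligned}
\frac{\partial^2f}{\partial\mathrm{vec}(\Sigma)\partial\mathrm{vec}(\Sigma)'}(y,\Sigma) &= \frac{1}{4}\{(\Sigma^{-1}y)\otimes(\Sigma^{-1}y) - \mathrm{vec}(\Sigma^{-1})\}\{(\Sigma^{-1}y)\otimes(\Sigma^{-1}y) - \mathrm{vec}(\Sigma^{-1})\}'f(y,\Sigma) \\
&\quad - \frac{1}{2}\{(\Sigma^{-1}yy'\Sigma^{-1})\otimes\Sigma^{-1} + \Sigma^{-1}\otimes(\Sigma^{-1}yy'\Sigma^{-1}) - \Sigma^{-1}\otimes\Sigma^{-1}\}f(y,\Sigma).
\end{aligned}$$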



Proof. From Lemma 8.1, we have 

d^f , d //(y,E) 



avec(E)vcc(E) 



(2/,S) = 



9vec(E) V 


2 ^' 


( ' 




Vavec(E) 


./(?/, S)^ 


/(y,s) 





$\{(\Sigma^{-1}y)\otimes(\Sigma^{-1}y) - \mathrm{vec}(\Sigma^{-1})\}'$
Again from Lemma 8.1, it is not hard to see that



In view of Lemma 4.3 in Turkington (2005), we have 

Leg) =^^(^^ ^ + 9vec(E) ^ '^-^ ^ ^ 

Also by Theorem 4.3 in Turkington (2005), we get 

avec(E-iy) ^ , ^ , avec(y'S-i) ^ , ^ , 



which implies that 

5vec(S-iyy'S-i) 



9vec(S) 



Further by Theorem 4.2 in Turkington (2005), we obtain $\frac{\partial\mathrm{vec}(\Sigma^{-1})}{\partial\mathrm{vec}(\Sigma)} = -\Sigma^{-1}\otimes\Sigma^{-1}$. The conclusion thus
follows directly from the above derivation. ◇

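Lemma 8.3. Let $\{\Sigma_T\} \subset \mathbb{R}^{(K+1)\times(K+1)}$ be a sequence of positive definite matrices with $K + 1 < T$. If
$K$ is fixed with respect to $T$ and $\|\Sigma_T - \Sigma\|_2 = O(1/T)$ for a positive definite matrix $\Sigma$, then we have
$\|\Sigma_T^{-1} - \Sigma^{-1}\|_2 = O(1/T)$.

Proof. Let $\Sigma_T = \Sigma + R_T$ with $\|R_T\|_2 = O(1/T)$. For sufficiently large $T$, we have $\|\Sigma^{-1}R_T\|_2 \le
\|\Sigma^{-1}\|_2\|R_T\|_2 < 1$. By the last equation at p. 355 of Horn and Johnson (1986), we have

$$\|\Sigma_T^{-1} - \Sigma^{-1}\|_2 \le \frac{\|\Sigma^{-1}\|_2^2\|R_T\|_2}{1 - \|\Sigma^{-1}R_T\|_2} = O(1/T). \qquad \diamond$$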
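
The perturbation argument in Lemma 8.3 can be illustrated numerically. The sketch below assumes the standard bound $\|(\Sigma + R)^{-1} - \Sigma^{-1}\|_2 \le \|\Sigma^{-1}\|_2^2\|R\|_2/(1 - \|\Sigma^{-1}R\|_2)$, valid when $\|\Sigma^{-1}R\|_2 < 1$; the function name and the test matrices are hypothetical and chosen only for illustration.

```python
import numpy as np

def inverse_perturbation(sigma, r):
    """Return (||(sigma + r)^{-1} - sigma^{-1}||_2, upper bound) in the spectral norm."""
    def spec(a):
        return np.linalg.norm(a, 2)

    inv = np.linalg.inv(sigma)
    lhs = spec(np.linalg.inv(sigma + r) - inv)
    contraction = spec(inv @ r)            # needs to be < 1 for the bound to apply
    bound = spec(inv) ** 2 * spec(r) / (1.0 - contraction)
    return lhs, bound

sigma = np.array([[2.0, 0.5, 0.2], [0.5, 1.5, 0.3], [0.2, 0.3, 1.0]])
r0 = np.array([[0.1, -0.05, 0.0], [-0.05, 0.2, 0.05], [0.0, 0.05, -0.1]])

results = {}
for T in (10, 100, 1000):
    results[T] = inverse_perturbation(sigma, r0 / T)   # perturbation of order 1/T
    print(T, results[T])
```

Multiplying the left-hand side by $T$ should give roughly constant values across the three choices of $T$, which is the $O(1/T)$ behaviour the lemma asserts.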



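Lemma 8.4. Let $\Sigma_T(y)$ be a $(K+1)\times(K+1)$ positive definite symmetric matrix which depends on $y \in \mathbb{R}^{K+1}$.
Assume that $\sup_{y\in\mathbb{R}^{K+1}}\|\Sigma_T(y) - \Sigma\|_2 \le \|\Sigma_T - \Sigma\|_2 = O(1/T)$ for a positive definite matrix $\Sigma$. Let
$R_T = \Sigma_T - \Sigma$. If $K$ is fixed with respect to $T$, we have

$$\int_{\mathbb{R}^{K+1}}\Big|\mathrm{vec}(R_T)'\frac{\partial^2f}{\partial\mathrm{vec}(\Sigma)\partial\mathrm{vec}(\Sigma)'}(y,\Sigma_T(y))\mathrm{vec}(R_T)\Big|dy = O(1/T^2).$$

Proof. Let $R_T(y) = \Sigma_T(y) - \Sigma$. Note that $\sup_{y\in\mathbb{R}^{K+1}}\|\Sigma^{-1}R_T(y)\|_2 \le \|\Sigma^{-1}\|_2\sup_{y\in\mathbb{R}^{K+1}}\|R_T(y)\|_2 \le
\|\Sigma^{-1}\|_2\|\Sigma_T - \Sigma\|_2 < 1$, for large enough $T$. By using the same arguments as in Lemma 8.3, we have
$\sup_{y\in\mathbb{R}^{K+1}}\|\Sigma_T^{-1}(y) - \Sigma^{-1}\|_2 = O(1/T)$. Therefore, when $T$ is sufficiently large, we have $y'(\Sigma_T^{-1}(y) -
\Sigma^{-1}/2)y = y'(\Sigma_T^{-1}(y) - \Sigma^{-1})y + y'\Sigma^{-1}y/2 \ge (\lambda_{\min}(\Sigma^{-1})/2 - \|\Sigma_T^{-1}(y) - \Sigma^{-1}\|_2)\|y\|_2^2 \ge 0$ for all $y$,
where $\lambda_{\min}(\Sigma^{-1})$ denotes the smallest eigenvalue of $\Sigma^{-1}$. On the other hand, for sufficiently large
$T$, we have $\sup_{y\in\mathbb{R}^{K+1}}|\Sigma_T(y)|^{-1} = \sup_{y\in\mathbb{R}^{K+1}}|\Sigma_T^{-1}(y)| \le \sup_{y\in\mathbb{R}^{K+1}}\|\Sigma_T^{-1}(y)\|_2^{K+1} \le (\|\Sigma^{-1}\|_2 +
\sup_{y\in\mathbb{R}^{K+1}}\|\Sigma_T^{-1}(y) - \Sigma^{-1}\|_2)^{K+1} \le C|\Sigma|^{-1}$ with $C > 0$. Combining the above arguments, we get
$f(y,\Sigma_T(y)) \le C|\Sigma|^{-1/2}\exp(-y'\Sigma^{-1}y/4) \le Cf(y,2\Sigma)$ for all $y$. When $K$ is fixed, $\|\cdot\|_2$ and $\|\cdot\|_\infty$
are equivalent, which implies $\sup_{y\in\mathbb{R}^{K+1}}\|\Sigma_T(y)^{-1} - \Sigma^{-1}\|_\infty = O(1/T)$. Since the elements of $\Sigma_T^{-1}(y)$ are
uniformly bounded for all $y$, in view of Lemma 8.2, it is straightforward to see

$$\Big|\mathrm{vec}(R_T)'\frac{\partial^2f}{\partial\mathrm{vec}(\Sigma)\partial\mathrm{vec}(\Sigma)'}(y,\Sigma_T(y))\mathrm{vec}(R_T)\Big| \le Cp(y)f(y,2\Sigma)/T^2,$$

where $p(y)$ is a polynomial of degree 4. The conclusion follows by noting that $\int p(y)f(y,2\Sigma)dy < \infty$. ◇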

Proof of Theorem 5.1. For convenience of presentation, we ignore the functional symmetry of
the covariance matrix in the proof. With some proper modifications, we can extend the results to the
case where the functional symmetry is taken into consideration. Let $|Q_1| = |Q_2| = \cdots = |Q_K| = q$. Define
$Y_i = q^{-1/2}\sum_{j\in Q_i}(X_j - \mu_0)$, and let $\bar{Y} = \frac{1}{K}\sum_{i=1}^{K}Y_i$ and $S_Y^2 = \frac{1}{K-1}\sum_{i=1}^{K}(Y_i - \bar{Y})^2$ be the sample mean and sample
variance of $\{Y_i\}_{i=1}^{K}$ respectively. Note that $T_K(Y) = \sqrt{K}\bar{Y}/S_Y$, where $Y = (Y_1,Y_2,\ldots,Y_K)'$. Simple
algebra yields that

$$\sigma_{ij} := \mathrm{Cov}(Y_i,Y_j) = \sum_{h=1-q}^{q-1}\frac{q - |h|}{q}\gamma_X(h - (j - i)q).$$



Notice that $Y$ follows a normal distribution with mean zero and covariance matrix $\Sigma_T$, where $\Sigma_T =
(\sigma_{ij})_{i,j=1}^{K}$. The density function of $Y$ is given by

$$f(y,\Sigma_T) = (2\pi)^{-K/2}|\Sigma_T|^{-1/2}\exp\Big(-\frac{1}{2}y'\Sigma_T^{-1}y\Big).$$

Under the assumption $\sum_{h=-\infty}^{+\infty}h^2|\gamma_X(h)| < \infty$, it is straightforward to see that $\|\Sigma_T - \sigma^2I_K\|_2 = O(1/T)$.
Taking a Taylor expansion of $f(y,\Sigma_T)$ around elements of the matrix $\sigma^2I_K$, we have

/(y, St) =/(y, <j'Ik) + [g^^iV' '^'^^)} ^cc(St - a'lx) 



+ vec(ST -CT^/if)' 



9vec(S)vec(S) 



(2/1 ST(y))vcc(ST - <J^Ik), 



where supj,gjjK ||St(2/) — cr^/A'lb < ||St — cr^/if||2 = 0(l/r). By Lemma 8.1 and Lemma 8.4, we get 
^^(y,a^/K) ^f{y,a'lK) {-^vec(/,,) + ® y 



and 



which imply that 



vec(ST - cr^/if )' 



d'f 



9vec(S)vec(S) 



(y,ST(2/))vec(ST-a2/if) 



dy^O 



rp2 I ' 



(20) 



( K ^ K K 

fiy, St) = /(y, a^/^,) 1 - ^ " ^') + ^/(y, ^'^k) E - ^'%) 

I j=l J i=l j = l 



g{y,<7^lK) + R{y), 



where $g$ denotes the major term, $R(y)$ is the remainder term and $\delta_{ij} = I\{i = j\}$ is Kronecker's delta.
Define $\Psi_K(x) = \int_{\{|T_K(y)|>x\}}g(y,\sigma^2I_K)dy$. By (20), we see that

$$\sup_x\Big|\int_{\{|T_K(y)|>x\}}f(y,\Sigma_T)dy - \Psi_K(x)\Big| \le \int|R(y)|dy = O(1/T^2).$$

It follows from some simple calculation that



^K{x) = l^l-^Y^^{au-a^)^P{\tK-i\>x) + ^iJ, + J2) 



where 



K 



Ji = Y^^Tu - a^)E[I{\fK{v)\ > x}vfl J2 = J2'^^jE[I{\fKiv)\ > x}v,v,]. 



1=1 



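Here $\{v_i\}_{i=1}^{K}$ are iid standard normal random variables and $T_K(v) = \sqrt{K}\bar{v}/S_v$ is the t statistic based
on $\{v_i\}$ with $\bar{v} = \frac{1}{K}\sum_{i=1}^{K}v_i$ and $S_v^2 = \frac{1}{K-1}\sum_{i=1}^{K}(v_i - \bar{v})^2$. Let $U = K\bar{v}^2$ and $D = (K-1)S_v^2$. Then
$U \sim \chi^2_1$, $D \sim \chi^2_{K-1}$, and $U$ and $D$ are independent. It follows that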

E[l{\fK{v)\ > xK] =lE[I{\fK{v)\ > x}J2^n 



SE[I{\fK{v)\ > x}U] + ^E{l{\fK{v)\ > x}D] 



4- 



UGk-1 



(^) 



1 



:E 



D~DGi 



Dx' 
K - 1 



and 



E[l{\fK{v)\ > x}v,v,] ^^^^^l—^^E[l{\fK{v)\ > x}J2v,v,] 

= -L-E[I{\fK{v)\ > x}U] j^^J_^f in\fKiv)\ > 4E^'?] 
'{K - 1)U 



4- 



UGk-1 



K{K - 1) 



D-DGi 



Dx^ 
K - 1 



We then have

P{\TK\>x)^^K{x)+0{l/T^)^{l-a}P{\tK-i\>x)+pE 



UGk-1 



{K-l)U 



T IK -l-E 



DGi 



Dx' 



(21) 



uniformly for $x \in \mathbb{R}$, where the coefficients are given by

K^B 



0(1/t2 



1=1 j=i 



and 



1 ^ 

T-5 5](^.,-^') 



1 



iK+l)B 
2a2T 



2<t2t 



C'(1/T2). 



The conclusion thus follows from equation (21). ◇

Proof of Proposition 5.1. Note first that



T{x,K)/K ^ -KP{\tK-i\ < x) + ^^E 







Xk-iGi 



Xk-1 2 

X 

K - 1 



$O(1/K)$.

Using the fact that $P(|t_{K-1}| \le x) = G_1(x^2) + \frac{1}{K-1}x^4G_1''(x^2) + O(1/K^2)$, we get



T{x,K)/K = ~ KGi{x'^ 



^ x^G'l[x-) + 'i±^E 



K - 1 



K -I ' K 

,2 



xI-Ag,{x^) 



- — - - 1 I X Gi(x ) + - ( - 1 I a; Gi (x ) 



0(1/X) 



Proof of Proposition 5.2. RecaU that q = T/K is assumed to be an integer. Using the notation in the 
proof of Theorem 5.1, let = Ej=i(^* - = 7f^{Ej=i ~ K{Y)^}. Notice that 



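\[
\begin{aligned}
\mathrm{cov}(Y) &=
\begin{pmatrix}
\sigma^2 - B/q & B/(2q) & & \\
B/(2q) & \sigma^2 - B/q & B/(2q) & \\
 & \ddots & \ddots & \ddots \\
 & & B/(2q) & \sigma^2 - B/q
\end{pmatrix}_{K \times K}
+ O(1/q^2)\,1_K 1_K' \\
&= \sigma^2 I_K + \frac{B}{2q}
\begin{pmatrix}
-2 & 1 & & \\
1 & -2 & 1 & \\
 & \ddots & \ddots & \ddots \\
 & & 1 & -2
\end{pmatrix}_{K \times K}
+ O(1/q^2)\,1_K 1_K' \\
&= \sigma^2 I_K + \frac{B}{2q}M + O(1/q^2)\,1_K 1_K',
\end{aligned}
\]

where $1_K = (1, 1, \ldots, 1)'_{1 \times K}$ and the summation of all the $O(1/q^2)$ terms is of order $O(K/q^2)$. Because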



\[
\sum_{h=1-q}^{q-1}\frac{q - |h|}{q}\,\gamma_X(h) = \sigma^2 - B/q + O(1/q^2),
\]



and 



we obtain 



\[
E[S_Y^2] - \sigma^2 = \frac{K}{K-1}\{\sigma^2 - B/q - \sigma^2/K + O(1/T)\} - \sigma^2 = -B/q + O(1/T).
\]

Consider the covariance matrix of $\tilde{Y}' = (Y_1 - \bar{Y}, Y_2 - \bar{Y}, \ldots, Y_K - \bar{Y})$. It is easy to see that $\tilde{Y} = (I_K - 1_K 1_K'/K)Y = H_K Y$, where $H_K = I_K - 1_K 1_K'/K$ is an idempotent matrix. Ignoring the $O(1/q^2)$ order term in cov(Y), we have



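\[
\begin{aligned}
(c_{ij})_{i,j=1}^{K} := \mathrm{cov}(\tilde{Y}) &= H_K \mathrm{cov}(Y) H_K \approx H_K\{\sigma^2 I_K + BM/(2q)\}H_K \\
&= \sigma^2 H_K + \frac{B}{2q}H_K M H_K = \sigma^2 H_K + \frac{B}{2q}\Big(M - \frac{1}{K}A - \frac{2}{K^2}1_K 1_K'\Big),
\end{aligned}
\]

where

\[
A =
\begin{pmatrix}
-2 & -1 & \cdots & -1 & -2 \\
-1 & 0 & \cdots & 0 & -1 \\
\vdots & \vdots & & \vdots & \vdots \\
-1 & 0 & \cdots & 0 & -1 \\
-2 & -1 & \cdots & -1 & -2
\end{pmatrix}_{K \times K}.
\]

Since $Y$ is Gaussian, we get

\[
(K - 1)^2 E[S_Y^4] = \sum_{i=1}^{K}\sum_{j=1}^{K}(c_{ii}c_{jj} + 2c_{ij}^2),
\]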



where $c_{ii} = (1 - \frac{1}{K})\sigma^2 - \frac{B}{q} + O(1/T)$ and $c_{ij} = -\frac{1}{K}\sigma^2 + \frac{B}{2q}I\{|i - j| = 1\} + O(1/T)$, for $i \neq j$. It implies that



and 



K 



K 



\i-j\ = l U-j\>l 



.KB'- 2{K - l)B 2 
cr H (T 



B'^ cr'^B\ (A'-l)(A'-2) 4 



4q2 



X2 



a* + 0(l/g) 



= {K - 1)^4 + 0(if/g), 



K 



cucjj =K^cl^ + 0{K/q) = (A- - 1) V 



4 2BAr(A: - i)ct2 



0{K/q). 



Therefore we get 



A' + l 



2BA'o-^ 



0(1/T), 



which implies



K-l {K-l)q 



A' - 1 



-0(1/T). 
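The projection step behind the covariance of the demeaned block means can be verified directly. The sketch below checks that $H_K$ is idempotent and that $H_K M H_K = M - A/K - (2/K^2)1_K 1_K'$, where $M$ is the tridiagonal matrix with $-2$ on the diagonal and $1$ next to it, and $A$ has $-2$ in the corners, $-1$ on the remaining border entries and $0$ elsewhere. The value $K = 6$ and the helper names are assumptions made for illustration, and the signs are those that make the identity hold numerically.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

K = 6  # hypothetical number of blocks

# Demeaning projection H_K = I_K - 1_K 1_K' / K
H = [[(1.0 if i == j else 0.0) - 1.0 / K for j in range(K)] for i in range(K)]

# Tridiagonal M: -2 on the diagonal, 1 on the first off-diagonals
M = [[-2.0 if i == j else (1.0 if abs(i - j) == 1 else 0.0)
      for j in range(K)] for i in range(K)]

def on_edge(i):
    """1 if index i is the first or last block, else 0."""
    return 1.0 if i in (0, K - 1) else 0.0

# Border matrix A: -2 in corners, -1 on other border entries, 0 inside
A = [[-(on_edge(i) + on_edge(j)) for j in range(K)] for i in range(K)]

HH = matmul(H, H)
HMH = matmul(matmul(H, M), H)
closed_form = [[M[i][j] - A[i][j] / K - 2.0 / K ** 2 for j in range(K)]
               for i in range(K)]

max_gap = max(abs(HMH[i][j] - closed_form[i][j])
              for i in range(K) for j in range(K))
print(max_gap)  # numerically zero
```

The border matrix arises because $M 1_K$ vanishes except in its first and last coordinates, so demeaning only disturbs the first and last rows and columns at order $1/K$.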



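Let $X = (X_1, X_2, \ldots, X_T)'$, $\hat{\mu}_{GLS} = (1_T'\mathrm{cov}(X)^{-1}1_T)^{-1}1_T'\mathrm{cov}(X)^{-1}X$ and $\sigma^2_{GLS} = T\,\mathrm{var}(\hat{\mu}_{GLS}) = T(1_T'\mathrm{cov}(X)^{-1}1_T)^{-1}$. Note that $\hat{\mu}_{GLS} - \mu_0$ is independent of $S_Y^2$ and $\sigma^2_{GLS} = \sigma^2 + O(1/T)$ [see Grenander and Rosenblatt (1957)]. Using similar arguments in Lemma 1 of Sun (2011b), we have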



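\[
\begin{aligned}
P(|T_K| \le x) &= P\Big(\frac{T(\hat{\mu}_{GLS} - \mu_0)^2}{\sigma^2_{GLS}} \le \frac{S_Y^2 x^2}{\sigma^2_{GLS}}\Big) + O(1/T) \\
&= E[G_1(S_Y^2 x^2/\sigma^2)] + O(1/T) \\
&= G_1(x^2) + \frac{x^2}{\sigma^2}G_1'(x^2)E[S_Y^2 - \sigma^2] + \frac{x^4}{2\sigma^4}G_1''(x^2)E[(S_Y^2 - \sigma^2)^2] + O(1/T) \\
&= G_1(x^2) - \frac{B}{q\sigma^2}x^2 G_1'(x^2) + \frac{1}{K-1}x^4 G_1''(x^2) + O(1/T).
\end{aligned}
\]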
8.3 Proof of the main results in section 5.2



We first establish a high order expansion for the Wald statistic based on the general LRV estimators considered in section 3. Let $\xi = (\xi_0, \xi_1, \ldots, \xi_K)'$ with $\xi_0 = \frac{1}{\sqrt{T}}\sum_{i=1}^{T}(X_i - \mu_0)$ and $\xi_j = \frac{1}{\sqrt{T}}\sum_{i=1}^{T}\phi_j^0(i/T)X_i$ for $j = 1, 2, \ldots, K$, and let $\Sigma_\xi$ be the covariance matrix of $\xi$. We first show that the convergence rate of $\Sigma_\xi$ is of order $O(1/T)$ when the number of basis functions $K$ is fixed.

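Lemma 8.5. Assume the basis functions $\{\phi_s(t)\}_{s=1}^{K}$ are bounded with finite discontinuous points and satisfy

\[
\sup_{a \in (0,1]}\Big\{\Big|\frac{1}{a}\int_0^{1-a}\phi_s(x)\{\phi_r(x + a) - \phi_r(x)\}dx\Big| + \Big|\frac{1}{a}\int_a^{1}\phi_s(x)\{\phi_r(x - a) - \phi_r(x)\}dx\Big|\Big\} < \infty,
\]

for $1 \le s, r \le K$. Recall that $\phi_s^0(t) = \phi_s(t) - \int_0^1 \phi_s(t)dt$. If $\sum_{h=-\infty}^{+\infty} h^2|\gamma_X(h)| < \infty$ and $K$ is fixed, then we have $\|\Sigma_\xi - \sigma^2\tilde{R}\|_\infty = O(1/T)$.

Proof of Lemma 8.5. For $s = 1, 2, \ldots, K$, we have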



T T 



cov(5o, ^s) ^ ^^Yl ^^^^ ~ * 



^0 / .? 



1=1 j=i 



T-l 



T T 



l<i,h+i<T 



h=l-T 



Q f h + i 



Simple algebra gives us 



1 ^ ^,fh + ^ 



T 

l<i,h+i<T 

It implies that 



T 



^EtiM^m-^EliM^m, h>0; 



1 „1 +c« |- ft, T N 

cov(Co,6) = ^/ '/'.(t)di E |/^l7x(M-^ E 7xW E'^«(*/^)+ E [ + ^(l/r^). 

ft=-oo 0<ft<T 4=1 i=T-/i+l ^ 

(22) 

Note that the second term on the right hand side of (22) is of order $O(1/T)$ because the basis functions $\{\phi_s(t)\}$ are bounded. Consider the covariance between $\xi_s$ and $\xi_r$ with $1 \le s, r \le K$. Straightforward calculation yields



T T 



cov(6,c.)=-EE<^° 



1=1 ]=1 

T-l 



.0^ 

r 



lx{i- j) 



h=l l<j,j+h<T 



T 



T 



ft=l-T l<j,j+ft<T 



,0 f j±± 

T 



T 



of ^ \ jfi I 



T 



Notice that 



l<i<T 



T 



I 1 ^0 ( 



T 



<^,(O0r(t)rft + CT(0.(t),0r(t)) =i?.r+CT(<^.(O,0r(t)), (23) 
where $C_T(\phi_s(t), \phi_r(t))$ is of order $O(1/T)$. It is not hard to see that



T-l 



h=l 
T-l 



T-/i 



T 



T 



T 



say J] 



1,T 



and 



/t=l-T 

-1 



E 

j=i+i/ii 



T 



T 

E ' 



T 



T 



^0 M 
T 



^ E TxW^E'^n^ 



h=l~T 



T 



T 
\h\ 



-E^M^ 



, say J2,T 



T 



Using (23), we have 

^l.T + J2,T 



\h\>T 

T 

E 

j=T-h+l 



T-l 



h=l 



aO ( L 

T 



° ? f 



(24) 



Under the assumption that $\sup_{a \in (0,1]}|\frac{1}{a}\int_0^{1-a}\phi_r(x)(\phi_s(x + a) - \phi_s(x))dx| < \infty$, it is straightforward to see that



Ji,t| <^ E \hlx{h)\ ^snp 



h=l 
T-l 



l<h<T 



T-h 
1 ^r- ,Q ( J 



T 



E 



/^i + h 
T 







1 sup 


-/ 


[Qe[o,i] 


a Jo 



(i)r{x){4>s{x + a) - 4>six))dx 



which implies that $J_{1,T} = O(1/T)$. The same argument applies to $J_{2,T}$. The proof is then complete. ◊
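The $O(1/T)$ size of the Riemann-sum error $C_T(\phi_s, \phi_r)$ in (23) can be seen in a concrete case. The sketch below uses the cosine functions $\phi_j(t) = \sqrt{2}\cos(j\pi t)$, which are assumed here purely for illustration and are not claimed to be the basis used in Section 3. For $s - r$ odd the scaled error $T \times \{\frac{1}{T}\sum_{i=1}^{T}\phi_s(i/T)\phi_r(i/T) - \int_0^1\phi_s\phi_r\}$ equals $-2$ for every $T$.

```python
import math

def phi(j, t):
    """Illustrative orthonormal cosine function on [0, 1] (an assumption)."""
    return math.sqrt(2.0) * math.cos(j * math.pi * t)

def riemann_gap(s, r, T):
    """Riemann sum of phi_s * phi_r over the grid i/T minus the exact integral."""
    riemann = sum(phi(s, i / T) * phi(r, i / T) for i in range(1, T + 1)) / T
    exact = 1.0 if s == r else 0.0   # orthonormality of the cosine system
    return riemann - exact

for T in (50, 200, 800):
    # T times the gap stays bounded, so the gap itself is O(1/T)
    print(T, T * riemann_gap(1, 2, T), T * riemann_gap(2, 2, T))
```

Smooth bounded functions behave the same way in general: the sum over the grid $i/T$ differs from the integral by a boundary term of order $1/T$.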

The assumption regarding the basis functions in Lemma 8.5 is mild. If $\{\phi_j(t)\}_{j=1}^{K}$ are Lipschitz continuous of order one, then the assumption is satisfied. It is easy to check that the condition holds for all three types of basis functions considered in Section 3. Recall the definition of $R$ and $L$ from Section 3, and let $\tilde{R} = \mathrm{diag}(1, R) = (\tilde{R}_{ij})_{i,j=0}^{K}$ and $\tilde{L} = \tilde{R}^{-1/2} = (\tilde{l}_{ij})_{i,j=0}^{K}$. Define

^ K K 

*T,/f(x) = Qi,k{x) + ^ E E^-^-^('^°^(^"^j) ~ - ^)^k{v) < x}], 

where v = {vq,vi, . . . ,vk) ''^ N{0^ Ik+i), Qi,k{x) is the distribution function of Qi_k as defined in 

2 

Theorem 4.1 and Tk{v) = — The following lemma establishes the high order expansion for 

Wald statistic based on the general LRV estimators when K is fixed. 

Lemma 8.6. Suppose $\sigma^2 > 0$. Under the assumptions in Lemma 8.5 and H2.0, we have $\sup_{x \in [0,+\infty)}|\Phi_{T,K}(x) - Q_{1,K}(x)| = O(1/T)$ and

\[
\sup_{x \in [0,+\infty)}|P(F_T(K) \le x) - \Phi_{T,K}(x)| = O(1/T^2), \qquad (25)
\]

with $K$ fixed and $T \to \infty$.



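Proof of Lemma 8.6. It follows directly from Lemma 8.5 that $\sup_{x \in [0,+\infty)}|\Phi_{T,K}(x) - Q_{1,K}(x)| = O(1/T)$. To show the second part, we first note that under the Gaussian assumption, the density function of $\xi$ is given by $f(u, \Sigma_\xi) = (2\pi)^{-(K+1)/2}|\Sigma_\xi|^{-1/2}\exp(-\frac{1}{2}u'\Sigma_\xi^{-1}u)$. Taking a Taylor expansion of the density function $f(u, \Sigma_\xi)$ around the covariance matrix $\sigma^2\tilde{R}$, we get

\[
f(u, \Sigma_\xi) = f(u, \sigma^2\tilde{R}) + \frac{\partial f}{\partial \mathrm{vec}(\Sigma)'}(u, \sigma^2\tilde{R})\,\mathrm{vec}(\Sigma_\xi - \sigma^2\tilde{R}) + R_T(u).
\]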
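The first-order term of this Taylor expansion relies on the derivative of a Gaussian density with respect to its covariance matrix, $\partial f/\partial\Sigma = \frac{1}{2}f\,(\Sigma^{-1}uu'\Sigma^{-1} - \Sigma^{-1})$. The sketch below compares that formula with central finite differences in a bivariate case. The matrix `S0`, the point `u0`, the step size and the function names are hypothetical choices made for this check.

```python
import math

def inverse_2x2(S):
    """Inverse and determinant of a 2x2 matrix given as a list of lists."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    inv = [[S[1][1] / det, -S[0][1] / det],
           [-S[1][0] / det, S[0][0] / det]]
    return inv, det

def density(u, S):
    """Bivariate N(0, S) density evaluated at u."""
    inv, det = inverse_2x2(S)
    quad = sum(u[i] * inv[i][j] * u[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def analytic_gradient(u, S):
    """Entries of 0.5 * f * (S^{-1} u u' S^{-1} - S^{-1})."""
    inv, _ = inverse_2x2(S)
    w = [inv[i][0] * u[0] + inv[i][1] * u[1] for i in range(2)]  # S^{-1} u
    f = density(u, S)
    return [[0.5 * f * (w[i] * w[j] - inv[i][j]) for j in range(2)]
            for i in range(2)]

def numeric_gradient(u, S, h=1e-6):
    """Central differences, perturbing one entry of S at a time."""
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            up = [row[:] for row in S]
            down = [row[:] for row in S]
            up[i][j] += h
            down[i][j] -= h
            grad[i][j] = (density(u, up) - density(u, down)) / (2 * h)
    return grad

S0 = [[2.0, 0.3], [0.3, 1.5]]   # hypothetical covariance matrix
u0 = [0.7, -1.1]                # hypothetical evaluation point
analytic = analytic_gradient(u0, S0)
numeric = numeric_gradient(u0, S0)
print(analytic)
print(numeric)
```

Evaluated at $\Sigma = \sigma^2\tilde{R}$, the same formula reads $f\{\frac{1}{2\sigma^4}(\tilde{R}^{-1}u) \otimes (\tilde{R}^{-1}u) - \frac{1}{2\sigma^2}\mathrm{vec}(\tilde{R}^{-1})\}$ in vectorised form.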



By Lemma 8.4, the remainder term $R_T(u)$ satisfies $\int_{\mathbb{R}^{K+1}}|R_T(u)|\,du = O(1/T^2)$. Following Lemma



.1, we have CT^i?) = /(m, cr^^)< 2^(i? ^u) (g) (i? ^m) — :5^vcc(i? ^) >, which implies that 



. if if . 

P{Ft{K) < x) - Qi,if (x) 1 - ^ E - '^'^^^) f + Ct(2:), 



where CT(a^-) = /{FT(«;if)<x} ^^^)(^ ^u)' ® iu)'vec(E^ - cr2^)rfu + J^^^^^.^^^^j i?T(w)rfu 
with Ft{u;K) ~ Ft{K). By letting v = Lu/cr and noting that E\i{FT{v) < x}vsVr] = for s 7^ r, we 
obtain 



Cr(a;) = -l-Emj^riv) < x}{v L)vec{^^ - a^R) + [ RT{u)du 

20" J{Ft(u-K)<x} 
K K 

^.=0s=0 J {Ft(u;K)<x} 



2(t2 

2 

where Fk{v) = ^k"°^ — 2 with w = (wq, ui, . . . , wa') being a {K + l)-dimensional vector of iid standard 
normal random variables. In view of the definition of $T,if (a;), we get 

< / \RT{u)\du = 0{l/T-'), 



sup \P{Ft{K) <x)- $T,if = sup 



RT(u)du 

{Ft{u;K)<x} 



K + l 



which completes the proof. ◊

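Lemma 8.7. Let $\{\Sigma_{T,J+1}\} \subset \mathbb{R}^{(J+1) \times (J+1)}$ be an array of positive definite matrices with $J + 1 \le T$. Assume that $\|\Sigma_{T,J+1} - \Sigma_{J+1}\|_\infty = O(J/T)$ for a sequence of positive definite matrices $\{\Sigma_J\}_{J=1}^{+\infty}$ with $\sup_J\|\Sigma_J^{-1}\|_2 < \infty$. If $J$ satisfies that $1/J + J^2/T \to 0$, then we have $\|\Sigma_{T,J+1}^{-1} - \Sigma_{J+1}^{-1}\|_\infty = O(J^2/T)$.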

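Proof. Let $\Sigma_{T,J+1} = \Sigma_{J+1} + R_{T,J+1}$. For sufficiently large $T$, we have $\|\Sigma_{J+1}^{-1}R_{T,J+1}\|_2 \le (J + 1)\|\Sigma_{J+1}^{-1}\|_2\|R_{T,J+1}\|_\infty < 1$, where we are using the fact that $\|R_{T,J+1}\|_2 \le (J + 1)\|R_{T,J+1}\|_\infty$. It follows that

\[
\|\Sigma_{T,J+1}^{-1} - \Sigma_{J+1}^{-1}\|_\infty \le \|\Sigma_{T,J+1}^{-1} - \Sigma_{J+1}^{-1}\|_2 \le \frac{(J + 1)\|\Sigma_{J+1}^{-1}\|_2^2\,\|R_{T,J+1}\|_\infty}{1 - \|\Sigma_{J+1}^{-1}R_{T,J+1}\|_2} = O(J^2/T).
\]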
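The inverse-perturbation bound used in this proof can be illustrated in the $2 \times 2$ case, where the spectral norm has a closed form. In the sketch below the matrices `Sigma` and `R` are arbitrary illustrative choices and $\|\cdot\|_\infty$ is assumed to be the largest absolute entry, which is the reading under which $\|R\|_2 \le (J + 1)\|R\|_\infty$ holds.

```python
import math

def inv2(A):
    """Inverse of a 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def mul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub2(A, B):
    """Entrywise difference of two 2x2 matrices."""
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def spec_norm(A):
    """Spectral norm: square root of the largest eigenvalue of A'A."""
    a = A[0][0] ** 2 + A[1][0] ** 2
    b = A[0][0] * A[0][1] + A[1][0] * A[1][1]
    d = A[0][1] ** 2 + A[1][1] ** 2
    lam = 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)
    return math.sqrt(lam)

def max_norm(A):
    """Largest absolute entry."""
    return max(abs(x) for row in A for x in row)

Sigma = [[2.0, 0.5], [0.5, 1.0]]        # hypothetical limit matrix
R = [[0.05, -0.02], [-0.02, 0.03]]      # hypothetical small perturbation
Sigma_T = [[Sigma[i][j] + R[i][j] for j in range(2)] for i in range(2)]

diff = sub2(inv2(Sigma_T), inv2(Sigma))
rho = spec_norm(mul2(inv2(Sigma), R))   # must be below 1 for the bound
lhs = spec_norm(diff)
rhs = spec_norm(inv2(Sigma)) ** 2 * spec_norm(R) / (1 - rho)
print(max_norm(diff), lhs, rhs)
```

The printed values are ordered as in the proof: the max-entry norm of the difference is bounded by its spectral norm, which is in turn bounded by $\|\Sigma^{-1}\|_2^2\|R\|_2/(1 - \|\Sigma^{-1}R\|_2)$.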
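Lemma 8.8. Let $\tilde{\Sigma}_{T,J+1}(y)$ be a $(J + 1) \times (J + 1)$ positive definite matrix which depends on $y \in \mathbb{R}^{J+1}$, and $\Sigma_{T,J+1}$ and $\Sigma_J = \sigma^2 I_J$ satisfy the assumptions in Lemma 8.7. Assume that $\sup_{y \in \mathbb{R}^{J+1}}\|\tilde{\Sigma}_{T,J+1}(y) - \sigma^2 I_{J+1}\|_\infty \le \|\Sigma_{T,J+1} - \sigma^2 I_{J+1}\|_\infty = O(J/T)$. Let $R_{T,J+1} = \Sigma_{T,J+1} - \sigma^2 I_{J+1}$. If $J = o(T^{1/6})$, we have

\[
\int_{\mathbb{R}^{J+1}}\Big|\mathrm{vec}(R_{T,J+1})'\,\frac{\partial^2 f}{\partial\mathrm{vec}(\Sigma)\,\partial\mathrm{vec}(\Sigma)'}(y, \tilde{\Sigma}_{T,J+1}(y))\,\mathrm{vec}(R_{T,J+1})\Big|\,dy = o(1/T).
\]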



Proof. Let RT„j+i{y) = Et,.7+i(2/) - (t^/j+i. Note first that sup^^gRj+i ||^t,j+i(2/)/ct^||2 < (J + 
1) supj^gU'^+i II-Rt,j+i(2/)||oo/ct^ < {J + 1)||St„/+i - <7^Ij+i\\oc./cr^ < 1, for large enough T. Follow- 
ing the arguments in Lemma 8.7, we know that 



sup \\J:j.j^j^{y) - <T /j+i 2 < ^7 n — 



Choose r = /T. Then we have 

y' (^t]j+M - ^i^r)a^ ^'^+^ y =y' (^T^j+iiy) - -^l 



0{J^/T). 



7+1 y 



(r + 1)0-2 



> 



,(r + 1)(72 

when T is sufficiently large. On the other hand, we have 



sup |S^;^+i(y)| < sup \\f:^lj^,{y)\[^+' ^\a2 ' T 



< 



< 



1 



1 cr-^'^' 



(r + 1)0-2 
1 



1 + r 



7 — I i\ 2 ^'+^ 
(r + 1)0-^ 



(1 + Cr) 



C(r + l)j2^2 x J+i 
T 

(l/r)(J+l)r 



< c 



1 



7 — r~rvT-^-'+i 

(r + 1)0-^ 



The above arguments imply that f{y, J^T,j+i{y)) < Cf{y, (1 + r)a'^Ijj^i) for all y. Therefore we get 

d^f 



vec(i?T,j+i)': 



-(y,5]j+i(y))vec(i?r.j+i) 



<C 



avec(S])vec(I]) 

vec(i?T.,7+i)'{(S7+i(y)y) ® (S7ii(y)y) - vec(S7ii(y))}{(S7|i(y)y) ® (S7|i(y)y) 



vec(S7i (y))}'vec(i?T.,7+i) 



/(y,(l + r)a2/,+i)'^y 



C 



vec(i?T,j+i)'{(S7ii(y)yy'l]7|i(y)) ® S7|i(y) + ^^^M ® (S7|i(y)yy'l]7|i(y)) 



- I]7|i(y) ® E7|i(y)}vec(ET,j+i) /(y, (1 + r)a'lj+i)dy < CJ'/T' = o(l/r), 

where the first inequality in the last row comes from the fact that supyg^j+i ||E7j[Lj^(y) — (J^'^Ij+i\\oo < 
sup^gR.+i \\tjl,{y) - a-^Ij+ih = 0{jyT). 



Lemma 8.9. Recall from Section 5.2 that $Q_{1,J}(x)$ is the distribution function of $Q_{1,J}$ for $J = 1, 2, \ldots, +\infty$. Then we have

\[
\sup_{x \in [0,+\infty)}|Q_{1,J}(x) - Q_{1,\infty}(x)| = O\Big(\sum_{j=J+1}^{+\infty}\lambda_j\Big). \qquad (26)
\]



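Proof. Let $U(J) = \sum_{j=1}^{J}\lambda_j v_j^2$ and $V(J) = \sum_{j=J+1}^{+\infty}\lambda_j v_j^2$, then $Q_{1,J} = v_0^2/U(J)$. For any $x \in [0, +\infty)$ and large enough $J$ with $J \ge 3$, we have

\[
\begin{aligned}
|Q_{1,J}(x) - Q_{1,\infty}(x)| &= |E[E[I\{Q_{1,J} \le x\}\,|\,U(J)]] - E[E[I\{Q_{1,\infty} \le x\}\,|\,U(\infty)]]| \\
&= |E[G_1(xU(J))] - E[G_1(xU(\infty))]| \\
&= |E[G_1(xU(J) + xV(J))] - E[G_1(xU(J))]| \\
&= |E[xV(J)G_1'(x\tilde{U}(J))]| = \Big|E\Big[\frac{V(J)}{\tilde{U}(J)}\,x\tilde{U}(J)G_1'(x\tilde{U}(J))\Big]\Big| \\
&\le C E\Big[\frac{V(J)}{U(J)}\Big] \le C E[V(J)]\,E\Big[\frac{1}{U(J)}\Big] \le C\sum_{j=J+1}^{+\infty}\lambda_j,
\end{aligned}
\]

where $U(J) \le \tilde{U}(J) \le U(J) + V(J)$ and $C$ does not depend on $x$. Note that we are using the mean value theorem, and the facts that $E[1/U(J)] \le E[1/(\lambda_3\chi^2_3)] < \infty$ and $\sup_x|xG_1'(x)| < \infty$. ◊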
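The two finiteness facts invoked at the end of this proof have explicit values, assuming as elsewhere in the appendix that $G_1$ is the $\chi^2_1$ distribution function. Then $xG_1'(x) = \sqrt{x}\,e^{-x/2}/\sqrt{2\pi}$ is maximised at $x = 1$, and $E[1/\chi^2_m] = 1/(m - 2)$ for $m > 2$. The sketch below evaluates both; the grid and the function names are illustrative.

```python
import math

def x_times_chi2_1_density(x):
    """x * G_1'(x), with G_1' the chi-square(1) density."""
    return math.sqrt(x) * math.exp(-x / 2) / math.sqrt(2 * math.pi)

def mean_inverse_chi2(m):
    """E[1 / chi2_m] = Gamma(m/2 - 1) / (2 * Gamma(m/2)), valid for m > 2."""
    return math.gamma(m / 2 - 1) / (2 * math.gamma(m / 2))

# Grid search for the supremum of x * G_1'(x) over (0, 50]
grid = [k / 1000 for k in range(1, 50001)]
sup_on_grid = max(x_times_chi2_1_density(x) for x in grid)

print(sup_on_grid)            # attained at x = 1
print(mean_inverse_chi2(3))   # finite because 3 > 2
```

Three degrees of freedom is the smallest integer number for which the inverse moment exists, which is consistent with the requirement $J \ge 3$ in the proof.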

Lemma 8.10. Let $V_T(J) = \sum_{j=J+1}^{+\infty}\lambda_j\xi_j^2$. Assume that $\sup_{1 \le j < \infty}\sup_{t \in [0,1]}|\phi_j(t)| < \infty$ and $\{X_t\}$ is a stationary Gaussian time series. Then we have $E V_T^2(J) = O((\sum_{j=J+1}^{+\infty}\lambda_j)^2)$.

Proof. Let (7^ = jx{i — ])■ For «, j > J + 1, we have 

T T 



n ,22 = 1 J1,J2 = 1 

T T 



^ E E 4>°i^hlT)<j)^i{i2lT)(j)^^{ji/T)(fPj{i2/T){^^^ 



y2 

=Il,T + -^2,T + ^3,T- 

For the first term, we have 



^1.^= E 0°(n/r)0°fe/TK,J E -^5 

\ il,i2 = l / \ Jlj2 = l 



Note that 



\Li: 



T-l 



/l — — CXD 



i<n<T 



^ E E 0°(*i/T)0°(W7^W 

/i^l-T l<zi,?i+/i<T 

<c E l^^e^)!' 

which implies that |/i.t| < ^'(E^^j^ |7x(/i)|)^- Similar arguments apply to the other terms 
I2.T and /3_T- We then have supj_|_i<j j<oo < G. Therefore, we obtain E[Vt{J)'^] = 
Lemma 8.11. Assume the eigenfunctions are continuously differentiable, mean zero and uniformly bounded, and -^j ^ Suppose that $\{X_i\}$ is a stationary Gaussian time series with
J2t=-oc^^hxih)\ < 00. When l/J+J/T^O, we have 



sup \P{Ft{J) <x) ~ PiFrioo) <x)\ = O 




Recall that Ft{J) ~ „ , " 2 for J = 1,2,..., 00. 



Proof. Let RriJ) = Ft(J) - Ft(oo) 



-. For any S > 0, we have 



\[
P(F_T(\infty) \le x - \delta) - P(|R_T(J)| > \delta) \le P(F_T(J) \le x) \le P(F_T(\infty) \le x + \delta) + P(|R_T(J)| > \delta). \qquad (27)
\]



Observe that 



^0 



1/2 



Choose a fixed Jo > 9, denote by St.Jq+i the covariance matrix of (Coi • • ■ : C/o)- -By Lemma 8.5, we 
know that ||St,Jo+i - o-^-?",/o+i||2 < (Jo + 1)||St,Jo+i - o-^-^Jo+ilU = 0{1/T). For large enough T, 
we have ||I;t,Jo+i||2 < 2ct^. Let A = min(l, > 0, we know that — A/j+i is semi-positive 

definite, i.e., for any x G R'^+^, x'S^ j^^^^^ — ^x'x. Using similar arguments in Lemma 8.3, we know 

1-1 ^ llv-l ll-'o 
1| :i II^T,Jo + lll2 



that ISt.Jo+iT^ < W^T^j +1112°^^ < (2/cr2)'^°+^ for large enough T. For any J > Jq, we have 



E 



^0 



<E 



<- 



1 



cyij){-~Xw' w /2)dw 



where $w = (w_0, w_1, \ldots, w_{J_0})'$ and $\chi^2_m$ denotes a chi-square random variable with $m$ degrees of freedom. By Lemma 8.10, we obtain



P{\RTiJ)\>S)<c\ J2 A,) A 



(28) 



In what follows, we show that $\sup_{x}|P(F_T(\infty) \le x \pm \delta) - P(F_T(\infty) \le x)| \le C\sqrt{\delta}$ for any $\delta > 0$. Let $X = (X_1, X_2, \ldots, X_T)'$, $1_T = (1, 1, \ldots, 1)'$, $X^* = X - 1_T\mu_0$ and $\Omega_T = \mathrm{cov}(X)$. Then the GLS estimate of $\mu$ is given by $\hat{\mu}_{GLS} = (1_T'\Omega_T^{-1}1_T)^{-1}1_T'\Omega_T^{-1}X$ and $\hat{\mu}_{OLS} - \mu_0 = \hat{\mu}_{GLS} - \mu_0 + \frac{1}{T}1_T'\tilde{X}$, where $\tilde{X} = (I_T - 1_T(1_T'\Omega_T^{-1}1_T)^{-1}1_T'\Omega_T^{-1})X^*$. The following facts, which can be found in Sun et al. (2008), play an important role in the proof presented below: (1) $\hat{\mu}_{GLS} - \mu_0$ is independent of $\tilde{X}$; (2) $\hat{\mu}_{GLS} - \mu_0$ is independent of $X - 1_T\hat{\mu}_{OLS}$. Notice that $D_T = \sum_{j=1}^{+\infty}\lambda_j\xi_j^2 = \frac{1}{T}(X - 1_T\hat{\mu}_{OLS})'G_T(X - 1_T\hat{\mu}_{OLS})$ with $G_T = (G(i/T, j/T))_{i,j=1}^{T}$. Then $\hat{\mu}_{GLS} - \mu_0$ is also independent of $D_T$. Define $\sigma^2_{GLS} = T\,\mathrm{var}(\hat{\mu}_{GLS}) = T(1_T'\Omega_T^{-1}1_T)^{-1}$. Denote by $\Phi_{norm}$ and $\phi_{norm}$ the cumulative distribution function and density function of the standard normal distribution. Therefore, we get

^{f^OLS - ^J■o) 



P(Ft(oo) < x) =2P 



'Dt 



<^/^\-l = 2P[ VTifioLS - Mo) < yxDr - 1 



--2P ( VfifiGLS - f^o)/<JGLS < ^xDt/(tgls - I't^ / (VfaGLs) j - 1 



--2E 



xbr/o-GLS - I'tX / {VTaGLs) ] 



which implies that for $x, \delta > 0$ with $x - \delta > 0$,
$|P(F_T(\infty) \le x \pm \delta) - P(F_T(\infty) \le x)|$



2E 



\x ± 5)DT/<yGLS - I't^/ ( VTctgls^ 



2E 



xDt/<Jgls - IrX/iVTa. 



GLS. 



DT(t>norni{a* - I't^ / (vTaGLs)) /<^GLS 



<cV5E[^/aGLs] < cV5{E[DT]y/y<jGLS < cVs, 



(29) 



with $\sqrt{xD_T}/\sigma_{GLS} \le a^* \le \sqrt{(x + \delta)D_T}/\sigma_{GLS}$ or $\sqrt{(x - \delta)D_T}/\sigma_{GLS} \le a^* \le \sqrt{xD_T}/\sigma_{GLS}$. Here we are using the fact that $\sigma^2_{GLS} = \sigma^2 + O(1/T)$ and $E|D_T|$ is uniformly bounded for all $T$. Choosing
^ = (E^lj+i ^J)^^^ the conclusion follows in view of (27), (28) and (29). <> 
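The independence facts quoted from Sun et al. (2008) come down to $\hat{\mu}_{GLS} - \mu_0$ and $\tilde{X}$ being jointly Gaussian with zero covariance. The sketch below checks the zero covariance, together with the decomposition $\hat{\mu}_{OLS} - \mu_0 = \hat{\mu}_{GLS} - \mu_0 + \frac{1}{T}1_T'\tilde{X}$, for a small AR(1) covariance matrix. The values $T = 5$ and $\rho = 0.5$, the vector `x_star` and the solver are illustrative assumptions.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

T, rho = 5, 0.5   # hypothetical sample size and AR(1) coefficient
Omega = [[rho ** abs(i - j) / (1 - rho ** 2) for j in range(T)] for i in range(T)]

w = solve(Omega, [1.0] * T)        # Omega^{-1} 1_T
a = [wi / sum(w) for wi in w]      # GLS weights: mu_GLS - mu_0 = a' X*

# X_tilde = (I_T - 1_T a') X*
Mres = [[(1.0 if i == j else 0.0) - a[j] for j in range(T)] for i in range(T)]

# cov(a' X*, X_tilde) = a' Omega Mres', computed entry by entry
a_omega = [sum(a[i] * Omega[i][j] for i in range(T)) for j in range(T)]
cross_cov = [sum(a_omega[j] * Mres[k][j] for j in range(T)) for k in range(T)]

# Decomposition of the OLS mean for one arbitrary centred vector
x_star = [0.3, -1.2, 0.8, 0.1, -0.4]
gls = sum(ai * xi for ai, xi in zip(a, x_star))
x_tilde = [sum(Mres[i][j] * x_star[j] for j in range(T)) for i in range(T)]
ols = sum(x_star) / T

print(max(abs(c) for c in cross_cov))   # numerically zero
print(ols, gls + sum(x_tilde) / T)      # the two numbers agree
```

Since $D_T$ is a function of $X - 1_T\hat{\mu}_{OLS}$ only, its independence from $\hat{\mu}_{GLS} - \mu_0$ follows from fact (2).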



Lemma 8.12. Under the assumptions in Theorem 5.2, we have $\|\Sigma_{T,J+1} - \sigma^2 I_{J+1}\|_\infty = O(J/T)$ with $J \le T$, where $\Sigma_{T,J+1}$ denotes the covariance matrix of $(\xi_0, \xi_1, \ldots, \xi_J)$.

Proof of Lemma 8.12. Using the arguments in Lemma 8.5, we have for any $1 \le s \le J$,



|cov(eo,Cs)| <C 



±Y.c|>s{^/T) +^ 7x(/^)|E'^^(V^)+ ^ M^/T)] 

i=l 0<h<T ^ 1=1 i=T-ft + l ■' 



<C/T, 



where $C$ is a generic constant which does not depend on $s$. Again by the arguments in Lemma 8.5, we have



T-l 



\h\>T 



h=l~T 



1 



T-l 



Y lx{h)\Y4>'r 



I 3 \ j,0 ( J 



T 



h=l ^ 3 = 1 

\Ji,t\ + \J2,tI l<s,r<J, 



T 



T 

]=T-h+l 



where $J_{1,T}$, $J_{2,T}$ and $C_T(\phi_s(t), \phi_r(t))$ are defined in the proof of Lemma 8.5. By the Trapezoidal rule and the assumption that $\sup_{1 \le j \le J}\sup_{t \in [0,1]}|\phi_j''(t)| \le CJ^2$, we have



(30) 



which implies that $|\mathrm{cov}(\xi_s, \xi_r) - \sigma^2\delta_{sr}| \le CJ/T + |J_{1,T}| + |J_{2,T}|$ for $J \le T$. By the mean value theorem and the assumption that $\sup_{1 \le j \le J}\sup_{t \in [0,1]}|\phi_j'(t)| \le CJ$, we get



T-1 



|Ji.t|<^EI7^('^)I 



h=l 
T-1 



T-h 



f j \ f j,0 f j + h 



T 



T 



( J_ 

T 



T-h 



(31) 



h=l 



T 



< 



CJ 
T ■ 



Using the same argument for $J_{2,T}$, we get $|\mathrm{cov}(\xi_s, \xi_r) - \sigma^2\delta_{sr}| \le CJ/T$, which completes the proof.

Proof of Theorem 5.2. Suppose $J = o(T^{1/6})$. By Lemma 8.12, we know $\|\Sigma_{T,J+1} - \sigma^2 I_{J+1}\|_\infty = O(J/T)$. Using Lemma 8.8 and similar arguments in the proof of Lemma 8.6, we can show that



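\[
\sup_{x \in [0,+\infty)}|P(F_T(J) \le x) - Q_{1,J}(x) - \psi_{J,T}(x)| = o(1/T),
\]

where $\psi_{J,T}(x) = \frac{1}{2\sigma^2}\sum_{i=0}^{J}(\mathrm{var}(\xi_i) - \sigma^2)E[(v_i^2 - 1)I\{F_J(v) \le x\}]$ with $v = (v_0, v_1, \ldots, v_J)' \sim N(0, I_{J+1})$. Next, we show that $\psi_{J,T}(x)$ converges uniformly as $J \to +\infty$. Note first that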

J+P 



sup \tpj+p,T{x) - tpj,T{x)\ < sup 

xe[Q.+oo) xe[Q,+oo) 



^ ^ (var(e.) - <j')E[{vf - m^j+piv) < x}] 



i=J+l 



sup 

xe[o,+oo) 



1 

^E(^a^(^') - '^')^[(«' - m{^j+piy) < ^} - H^Av) < x})] 



i=l 



+ sup 

xe[o,+co) 

-h + h + h, 



^(var(^o) - ^')E[iv^o - < x} - I{Tjiv) < x})] 



for any $J, p \in \mathbb{Z}_+$. In view of (30) and (31), we have

\[
|\mathrm{var}(\xi_i) - \sigma^2| \le C(i/T + i^2/T^2), \qquad (32)
\]

for $1 \le i < \infty$. Hence we get, for sufficiently large $J$,



J+P 



^1 .i^p . E 



c 



J+P 



(varte) - a^)E 



J+P 



<- snp 5] + 



c 



J+P 



a:e[0,+oo) 

J+P 



E 



E 



J+P 

{vI-1)gAxY,^jv] 



{vl -1)\gA xY,\,v] + KvlxG',{y.,) 



J+P 



<% sup ^ + *Vr)A,i? + l)xG'i(2/,)| <% E + *Vr)A.i? 



J+P 



i=J+l 



<r E {^ + ^VT)KE[v}{vj + l)\E 



i=J+l 

+ 00 



„ f +00 +00 ^ 

% E + ^ E ^^A. Uo 



T 

,i=J+l i=J+l 



j-a+2 



(33) 



where yi = x{^-_^^ XjV^ + aiX^vf) for some < < 1. On the other hand, we get 



J 



. CJ V- 
h <^ sup 2^ 



< 



T 
CJ 



a:e[0,+oo) -^^ 
J 



^ :i^p .E 



r 

CJ 



a:6[0,+oo) -^^ 
J+P 



J+P 



(t.f - 1) <^ Gi X ^ A, t;f - Gi A,^;| 



J+P 



i=i 



/ J+P 

2 



u=J+i 



\j=J+i 



J \ ,,2 



E .1 A, 



3 = 1 "3 



< 



T 



E ^H-o 

Vi=J+i 



i=i i=J+i 

j-a+3^ 



T 



Finally using the Cauchy-Schwarz inequality and similar arguments in Lemma 8.9, we know 
h<^{E[{vl~lY]Y'^ sup {E[{l{Fj+,{v)<x}^l{Fj{v)<x}f]Y'^ 



a-e[o.+oo) 



+ 00 



1/2 



<^ sup {E[\l{Fj+,{v)<^}-n^j{^)<m"^<%\Y.^^ 



a;e[0,+oo) 



O 



yj=J 



j(-a + l)/2 

T 



Therefore, it is straightforward to see that sup^gjQ g^-j |^j_T(a;) — '0T(a;)| = O ( J' ^+1)72^21^ ^^^^ 
sup^6[o,oo) \i^T{x)\ = 0{l/T), which imply that 

\[
\sup_{x \in [0,\infty)}|P(F_T(J) \le x) - Q_{1,J}(x) - \psi_T(x)| = o(1/T),
\]

for J = o(ri/6). Let J = Ti/Vlog(r) and note that (E^j+i Aj)^/^ = "(l/T). The proof is completed 
in view of Lemma 8.9 and Lemma 8.11. <^ 

Proof of Proposition 5.3. Under the assumption that $\sup_{x \in \mathbb{R}}|\mathcal{K}(x)| \le 1$ and $\int_{-\infty}^{+\infty}|\mathcal{K}(x)|\,dx < \infty$, we have



Ur,t)dr < 4 sup / \gb{r,t)\dr 



te[o,i] -^0 



+00 „i „i „i 

y"(Aj,fc)'=/ / gUr,t)drdt< sup / ^, 
^■^2 "'o "'o te[o,i]Jo 

/I— t /'+00 
|/Cfc(r)|dr < 16 / \ICb{r)\dr < Cb, 



and Ai,6 < S^Ql{r,t)drdtYl^ < CVb. Suppose {aj is a sequence of random variables such that 
< 5,i < 1. Using the fact that Yli'^=i ^j.b = Gbi^, r)dr = 1 + 0{h), we get 



+00 



+00 



supS 

By the Taylor expansion, we have

+00 



1 I = sup <( 2 Y^iXj.bf + {KbfE{d,vl - 1)' J> + 0{h) < Cb. (34) 



V'T.fclx) =^ 5](var(^\b) - _ l)I{^^{v) < x}] + 0(1/T) 



-\-oo 



+CXD 



(vf - 1)G^ \ xJ2h 



bi'i 



i=i 



0(1/T) 



4=1 

2 +°° 



+00 



bV, 



a; 
4^ 



E(A.,6)'(var(e,,6) - a^)E 



+00 



^f{vf-l)G'l\xiY,kb^'j 



-0{1/T) 



--hT.b + hTfi + 0{\/T), 



where < < 1. Let An, = E 



G'i[x[ >'j,bv'j , Bi^b = Ai^6(var(^i^6)-cr2)^ (;7^^ ^ T,]=i Ej,b and 



•Sjv.fc = Z^ili ^i,bBi^b- Using summation by parts, we have S'Ar^i = AN,bCN,b - Y.'iLi'' i^i+i,b ~ ^4,6)^,6 • 
Note that {Ai^h}t^ is a nonincreasing sequence and hmt_yo supj Ai^b = G'i{x) as seen from (34). Let 
DT,b be defined by replacing and Xj with and Aj.b in the definition of Dt- It is not hard to see 
that as b+l/{bT) 0, 



+00 



hm AM,bCNM = ^''G'{x) I EIDtAI^^ - E I (l+o(l)) = _ ^'^^^^^ ^^^^ °° |fe|^7x(^) (^^^(^^^^^(i/y)^ 



JV— >+oo 



{bTY 



where we have used the fact E[DT,b\/^'^ - Y.'^Z hfi = -^^^%fri^P^(l + o(l)) + 0{l/T), which 
can be proved by using similar arguments in the proof of Lemma 2 in Sun et al. (2008). On the other 
hand, observe that \ Yl!iJi{^i+i.b~ A,,^b)Ci^\ < sup^g^ |Ci,6| X]^lT^(^i,& - ^i+1,6) < supjgpj |Ci,6|(^i,6 - 
limjv_i.+oo Afq^b) = o(| limAr_^+oo CN,b\) as 6 + l/{bT) — > 0, for all TV. Hence we get 



'lT,b 



xG'{x)g,Y.l=-oo\h\''lx{h) 
<j^{bT)i 



(i + o(i)) + o(i/r). 



Define Hiu — XihE 



vfivf - 1)G" ix ^3,bv] + aiXi^bvf] ] and SN,b = LiJT Hi^bBi^b- Again using 
summation by parts, we obtain $S_{N,b} = H_{N,b}C_{N,b} - \sum_{i=1}^{N-1}(H_{i+1,b} - H_{i,b})C_{i,b}$. By (34), we can show that
supj \H,^b/>^i,b - 1201(2;) I = 0{\/b). Therefore, we get hmAr^+oo CN,bHN,b = o(limN^+oa CN.b) and 



N-l 



i=l 



<snp\Q,f,\\ Vdi^.+i.b- 12A,+i,5G'/(2:)| +12G"(x)(A,.6 - A,+i,b) 
+ |12A,,bG'/(x) - H,,b\)\ = o(Vb hm CnA ■ 

J \ W-5- + 00 J 

The conclusion follows from the above arguments by noting that $I_{2,T,b} = o(I_{1,T,b})$.



8.4 Proof of the main results in section 6 

Proof of Theorem 6.1. The proof is similar to those of Lemma 8.13 and Theorem 6.2. The details are omitted. ◊

Lemma 8.13. Let ijJi{x) = (l - \x/l\)l{\x/l\ < l}. Suppose that nv" /P + (mlf/T + 1/m and 
'^h=-oo ^'^\lxih)\ < oo. Then under the Gaussian assumption, we have 



sup 

0<A:<m 



l-l 



T-1 



X! 9k,T{h)uJi{h)jx{h) ~ ^ gk,T{h)'-fx{h) 



h=l-T 



where $|g_{k,T}(h)| \le C(k|h| + |h| + 1)$ for $0 \le k \le m$ and $|h| < T$, and the constant $C$ does not depend on $k$ and $h$.

Proof of Lemma 8.13. Note first that for any $\epsilon > 0$,



P sup 

\ 0<A;<m 



< 



^ gk,T{h)uji{h)^x{h) - gk,T{h)jxih) 

h=l-l h=l-T 
l-l T-1 

^ gk,T{h)uJi{h)jx(h) - ^ gk,T{h)jx{h) 



> e 



< 



< 



^ ra 
fc=0 

fe=0 



h=l-l 
l-l 



h=l-T 
T-1 



> e 



gk.T{h)uji{h)jx{h) - ^ gk,T{hhx{h) 



h=l-l 
l-l 



h=l-T 



Y 9k,T{h){uJi{h)jx{h) ~ jx(h)} 



h=l-l 



1^' 



Let Zi ^ Xi — E[Xi\ and = ZiZi_^\i^\ — ^x{h). Simple calculation yields that 



l-l 



Y gk.T{h){uji{h)'jx{h) - 7x(/i)} 



h=l-l 



l-l 



Y 9k,T{h)uji{h){jx{h) - -fx{h)} 



h=l-l 



+ C{k + l)ll 



< 



l-l ( -, T-\h\ ^ l-l 

Y gk,T{h)ioiih)\- J2 \ + E gkAh)u^iih)\^^zH 

h=l-l y 2=1 J h=l-l ^ ' 



l-l 



Y 9k,T{h)uJi{h) I - Y {zi + z,+\h\) 



h=l-l 



T-\h\ 



C{k + 1)/; := Lit + hr + hr + C{k + 



which implies that E EL\-i gk,Tih){uJi{h)^x (h) - jx (h)} < C{Elfj, + EL^j. + EL^j, + (fc + 1)^/1 
We proceed to derive the order of $E I_{1,T}^2$. Notice that

^ 1^1 T-|/ii| T-|h2| 

EIIt^t^ X! 5Z 51 coY{w,^\h^\,w,^\u^\)gk,T{hi)gk,T{h2)i^i{hi)uji{h2) 

hi,h2—l — l 42 — 1 

,C(fc + l)2 



<- 



2^2 



E E E i\hi\ + mh2\ + l)\lxiii'i2hx{ii-i2 + \hi\-\h2\ 



hi,h2 — l — l ii — l 22— 1 
C(fc+1)2 



< 



T2 E E E i\hl\ + mh2\ + l)\lx{^l-^2-\h2\hx{^l-^2 + \hl\ 

hi ,h2 — l — l — 1 42 — 1 



2^2 



hi, 112 = 1-1 s=l-T 

/-I T-1 



^^yV^ E E (r-kl)(|/ii| + l)(|/i2| + l)|7;f(s-|/i2|)7x(. + |/ii|)|:= J-l,T + ^2,T. 



Then we get 



Jl,T < 



< 



and 



>J2.T < 



< 





1-/ i5= 




1)2 


T 






hi 


C(fc + 




T 




C{k + 


If 


T 






hi 


C{k + 


Ifl 


T 


s 


C{k + 





+ CSO 



i\h,\ + mh2\ + i) hx{shx{s + M-\h2\ 



s— — oo 



l-l 



E E i7x(. + i/^ii-i/^2i)i< 

hi .h2 — l — l 



Cik + lfl^ 



S — — 00 



l-l +00 

Y (|/ll|+l)(|/l2| + l) E bx{shx{s+\hi\ + \h,\ 
hiJi2 = l — l s= — oo 

+ 00 l-l 

E E (|/^ll + |/^2| + l)|7x(.+ |/il| + |/.2|) 



/il,/!.2 = l- 

21-1 



; ^ E i7x(.)iE-'i7x(^+-) 



< 



C{k + lfl 
T 



It imphes that Elfj. < ^^^^^^y^-!—. Applying similar arguments to I2T and I^t, we get < C(fc + 

1)2/4/^2 g^^^ ^j2^ < (j^f^ ^ ifl^/T^, Note the constant C above does not depend on m by the 
assumption. We then have 



P sup 

\ a<k<m 



l-l 



T-1 



E 9k,T{h)u>i{h)^x{h) - Y 9k,T{h)jx{h) 



h=l-l 



h=l-T 



> e I < + (mlf/T) ^ 0. 



Proo/ of Theorem 6.2. We choose m so that m^/P + {mlf/T+l/m (e.g., / x T^/^ and m x T^/is-"^ 
for some e > 0). From equation (24) in Lemma 8.5, we know that 

var(6)-^72-(var*(en-'^')=^<^ E 9^.T{hhx{h) - ^ g.,TihUih)^x{h) \ - E ^^W' 
where $\hat{\sigma}^2 = \sum_{h=1-l}^{l-1}\omega_l(h)\hat{\gamma}_X(h)$ and $g_{0,T}(h) = -|h|$,



h 

E 



AO 1 1 
T 



2 T 

+ E 



> 1} 



T-h 

E^n^ 



.0 (I 
T 



l{h > 1} 
l{h<-l}, 



for 1 < i < m. Note that sup^^^^^\TCT{4>i{s),4'i{t))\ !i C- It is not hard to see that \gi^T{h)\ < 
C(|i/i| + \h\ + 1) for < i < m. By Lemma 8.13, we know 



sup |var(^,) - cr^ - var*(^*) + ^ Op 

0<i<m 



\/m^/P + imlf/T' 



T 



Since the bootstrap sample is normally distributed and $\sum_{h=1-l}^{l-1}h^2\omega_l(h)|\hat{\gamma}_X(h)|$ is bounded in probability in view of the fact that $\sum_{h=-\infty}^{+\infty}h^2\omega_l(h)E|\hat{\gamma}_X(h)| < \infty$, Theorem 5.2 is also applicable to the bootstrap sample, i.e.,
sup \PiF^{^) <x)~ Qi,oo(.t) - rAx)] = Op(l/T), 

xG[0,c») 

where ip*j,[x) = E^o(var*(^*) - a-'^)E[{vf - l)l{Foo{v) < x}]. It is not hard to show that &^ - 
0-2 = Op{^l/T +1/P). Note that var*(C*) - ct^ = ^ EL=\-/ 5i,T(^)w;(/i)7x(/i), which implies that 
|var (j- }-<j I _ Op{l) [see e.g., (32)]. Using the arguments in (33), we can show that 



SU-Pl<i<+oo i/T+i'-^ /T' 

^ E (var(6)-var*(Cn+<^'-^')i?[(«f-l)I{-F.o(«)<x}] 



sup 

2:e[0,+oo) 



Thus we get 



i— m+l 



/ 1 



sup 1^/^(3;) — iPt{^)\ — sup 

a;G[0,+oo) 2;e[0,+oo) 



^ 00 

^ $](var(e.) - var*(en + a' - a^)E[{vj - l)I{J-^(«) < x}] 



i=0 



sup 

a;e[0,+oo) 



1 1 

25^ ~ 2^ 



^(var*(C;) - a^)E[{vf l)I{J-^(«) < x}] 



1 



- TTl l^^'^i^^ ) -(^^ - var* (4* ) + (J^ I sup 

l<i<m xe[0,+oo) 



sup 

xe[o,+oo) 



Y,E[{v^-m^oo{v)<x}] 

^ ^ (var(e;) - var*(e*) + - a2)ii;[(«| - l)I{-Foo(t') < x\ 



o, 



=o„ 



Or, 



Tm 



a-2 



It then follows that $\sup_{x \in [0,+\infty)}|P(F_T(\infty) \le x) - P^*(F_T^*(\infty) \le x)| \le \sup_{x \in [0,+\infty)}|\psi_T^*(x) - \psi_T(x)| + o_p(1/T) = o_p(1/T)$.
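The bootstrap variance in this proof is built from the Bartlett weights $\omega_l(x) = (1 - |x/l|)I\{|x/l| < 1\}$ of Lemma 8.13 through $\hat{\sigma}^2 = \sum_{h=1-l}^{l-1}\omega_l(h)\hat{\gamma}_X(h)$. A minimal sketch of that estimator follows. It assumes the sample autocovariance is centred at the sample mean and scaled by $1/T$, which is a common convention but is not confirmed by the text above; the function names and the toy series are hypothetical.

```python
def bartlett_weight(h, l):
    """omega_l(h) = (1 - |h/l|) if |h/l| < 1, else 0."""
    r = abs(h / l)
    return 1.0 - r if r < 1 else 0.0

def sample_autocov(x, h):
    """Sample autocovariance at lag h, centred at the sample mean (assumption)."""
    T = len(x)
    xbar = sum(x) / T
    h = abs(h)
    return sum((x[i] - xbar) * (x[i + h] - xbar) for i in range(T - h)) / T

def bartlett_lrv(x, l):
    """Bartlett-kernel long run variance estimate with truncation lag l <= len(x)."""
    return sum(bartlett_weight(h, l) * sample_autocov(x, h)
               for h in range(1 - l, l))

series = [1.0, 2.0, 3.0, 4.0]    # toy data
print(bartlett_lrv(series, 1))   # lag 0 only: the sample variance
print(bartlett_lrv(series, 2))   # adds lags -1 and +1 with weight 1/2
```

With truncation lag 1 only the lag-0 term survives, so the estimate reduces to the sample variance; larger lags add autocovariances with linearly declining weights.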
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;ure 2: Empirical rejection probabilities for the Wald statistic with the Bartlett kernel 
:t panel) and QS kernel (right panel) and for the AR(1) model with A^(0, 1) innovations 
Figure 3: Empirical rejection probabilities for the Wald statistic with the Bartlett kernel
(left panel) and QS kernel (right panel) and for the AR(1) model with t(3) innovations